Elementary Proof of the Remez Inequality


Borislav Bojanov 


This note is concerned with the Tchebycheff polynomials $T_n(x)$. As is well known,
they can be represented on $[-1, 1]$ by the expression

$$T_n(x) = \cos(n \arccos x).$$

The famous Russian mathematician Pafnutii Lvovich Tchebycheff (1821-1894)
introduced $T_n(x)$ as the polynomial of least uniform norm on $[-1, 1]$ among the
polynomials of degree $n$ with fixed leading coefficient.

The Tchebycheff polynomials appear prominently in various extremal problems
posed in $\pi_n$ (the set of all polynomials of degree $n$). An illuminating example is the
classical Markov inequality, which shows that

$$\|p^{(k)}\| \le \|T_n^{(k)}\|, \qquad k = 0, \ldots, n,$$

for each $p \in \pi_n$ such that

$$\|p\| := \max\{|p(x)| : x \in [-1, 1]\} \le 1.$$


The proof of this and many other remarkable properties of $T_n$ can be found in the
recent book of Rivlin [4].

It was already mentioned by Tchebycheff that $T_n$ is the fastest growing
polynomial outside $[-1, 1]$. In other words,

$$\max\{|p(\xi)| : p \in \pi_n, \ \|p\| \le 1\} = |T_n(\xi)|$$

for each $|\xi| > 1$. This observation provokes the following question: How large can
a polynomial be, given that it is constrained to be "small" on a substantial portion
of its domain? We make the problem precise as follows.

Let $\sigma$ be an arbitrary fixed positive number. For every $p \in \pi_n$ define the set

$$M(p) = \{x \in [-1, 1 + \sigma] : |p(x)| \le 1\}.$$

Clearly $M(p)$ consists of mutually disjoint closed subintervals. Let $|M(p)|$ be the
measure of $M(p)$, i.e., $|M(p)|$ is the total length of these subintervals. Denote

$$\pi_n(\sigma) = \{p \in \pi_n : |M(p)| \ge 2\}.$$

The problem is to characterize the polynomial $p^*$ from $\pi_n(\sigma)$ which has maximal
uniform norm over $[-1, 1 + \sigma]$.

Evidently, the Tchebycheff polynomial $T_n(x)$ belongs to $\pi_n(\sigma)$ for each $\sigma > 0$
since $|T_n(x)| \le 1$ on $[-1, 1]$ and $|T_n(x)| > 1$ for $|x| > 1$. In 1936 Remez [1]
established the following:

$$\sup_{p \in \pi_n(\sigma)} \|p\|_\sigma = \|T_n\|_\sigma, \qquad (1)$$

where the supremum norm is over $[-1, 1 + \sigma]$. Of course $\|T_n\|_\sigma = T_n(1 + \sigma)$. The
proof of (1) can also be found in the book of Freud [2]. A simpler approach was
found recently by Erdelyi [3]. We demonstrate here a short, elementary proof.


The proof: Note that for any fixed $x \in [-1, 1 + \sigma]$ the quantity

$$\mu(x) = \sup\{|p(x)| : p \in \pi_n(\sigma)\}$$

is attained for some polynomial from $\pi_n(\sigma)$. We shall show first that $\mu(x) \le \mu(1 + \sigma)$
for each $x \in [-1, 1 + \sigma]$. Indeed, let $x$ be an interior point of $[-1, 1 + \sigma]$
and let $p$ be the extremal polynomial for this point, i.e., $p \in \pi_n(\sigma)$ and
$|p(x)| = \mu(x)$. Introduce the polynomials

$$p_1(t) = p(\alpha(t)), \qquad p_2(t) = p(\beta(t)),$$

where $\alpha : [-1, 1 + \sigma] \to [-1, x]$ and $\beta : [-1, 1 + \sigma] \to [x, 1 + \sigma]$ are the linear
transformations. Let $M_1$ and $M_2$ be the parts of $M(p)$ situated in $I_1 := [-1, x]$
and $I_2 := [x, 1 + \sigma]$, respectively. Assuming that $|M_i| < \lambda |I_i|$ for $i = 1, 2$ and
$\lambda = 2/(2 + \sigma)$ we would get $|M(p)| = |M_1| + |M_2| < \lambda(|I_1| + |I_2|) = \lambda(2 + \sigma) = 2$, a
contradiction. Therefore $|M_i|/|I_i| \ge \lambda$ at least for one $i$, say for $i = 1$. Then
$|M(p_1)| \ge 2$ and hence $p_1 \in \pi_n(\sigma)$. This yields

$$\mu(x) = |p(x)| = |p_1(1 + \sigma)| \le \mu(1 + \sigma).$$

Therefore the Remez inequality will be proved if we show that

$$|p(1 + \sigma)| \le T_n(1 + \sigma) \quad \text{for each } p \in \pi_n(\sigma).$$


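In order to show this, denote by $-1 = \eta_0 < \eta_1 < \cdots < \eta_n = 1$ the extremal
points of $T_n$. We have

$$T_n(\eta_k) = (-1)^{n-k}, \qquad k = 0, \ldots, n. \qquad (2)$$

Let $x_0 < x_1 < \cdots < x_n$ be the points of $M(p)$ which coincide with $\eta_0, \ldots, \eta_n$
after we press $M(p)$ to the left, i.e., to the interval $[-1, |M(p)| - 1]$. By the
Lagrange interpolation formula

$$|p(1 + \sigma)| \le \sum_{k=0}^{n} \prod_{\substack{i=0 \\ i \ne k}}^{n} \frac{|1 + \sigma - x_i|}{|x_k - x_i|}$$

since $|p(x_k)| \le 1$. Now taking into account the obvious inequalities $|1 + \sigma - x_i| \le
|1 + \sigma - \eta_i|$, $|x_k - x_i| \ge |\eta_k - \eta_i|$ and (2), we get

$$|p(1 + \sigma)| \le \sum_{k=0}^{n} \prod_{\substack{i=0 \\ i \ne k}}^{n} \frac{|1 + \sigma - \eta_i|}{|\eta_k - \eta_i|} = T_n(1 + \sigma).$$

The proof is completed.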
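
The last equality in the proof can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes the extremal points as $\eta_k = \cos((n-k)\pi/n)$ and evaluates $T_n$ outside $[-1, 1]$ by the standard formula $\cosh(n \,\mathrm{arccosh}\, x)$; the function names are ours and do not come from the note.

```python
import math

def chebyshev_outside(n, x):
    """T_n(x) for x >= 1, via cosh(n * arccosh(x))."""
    return math.cosh(n * math.acosh(x))

def lagrange_bound(n, sigma):
    """Sum over k of prod_{i != k} |1 + sigma - eta_i| / |eta_k - eta_i|."""
    # Extremal points of T_n, ordered from -1 to 1 (an assumed explicit form).
    eta = [math.cos((n - k) * math.pi / n) for k in range(n + 1)]
    t = 1.0 + sigma
    total = 0.0
    for k in range(n + 1):
        term = 1.0
        for i in range(n + 1):
            if i != k:
                term *= abs(t - eta[i]) / abs(eta[k] - eta[i])
        total += term
    return total

if __name__ == "__main__":
    for n, sigma in [(3, 0.5), (5, 0.3), (8, 0.1)]:
        print(n, sigma, lagrange_bound(n, sigma), chebyshev_outside(n, 1 + sigma))
```

For each pair the two printed values should agree up to rounding, which is the identity used in the final step.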


The author is grateful to the referee and to the editor for their useful remarks. 
A Note on an Identity of Ramanujan


T. S. Nanjundiah 


In a forthcoming paper [1], Berndt and Bhargava have supplied a proof of this
eye-catching identity of Ramanujan found in his third notebook [3, p. 386]: if
$ad = bc$, then

$$64\{(b+c+d)^6 - (a+c+d)^6 - (a+b+d)^6 + (a+b+c)^6 + (a-d)^6 - (b-c)^6\}$$
$$\times \{(b+c+d)^{10} - (a+c+d)^{10} - (a+b+d)^{10} + (a+b+c)^{10} + (a-d)^{10} - (b-c)^{10}\}$$
$$= 45\{(b+c+d)^8 - (a+c+d)^8 - (a+b+d)^8 + (a+b+c)^8 + (a-d)^8 - (b-c)^8\}^2.$$

It figures also in their expository article [2] featuring a selected group of
Ramanujan's results. Unfortunately, they have missed its simple proof and so its genesis by
not noticing that it is built from two sets of sums:

$$u_n = \alpha_1^n + \beta_1^n + \gamma_1^n, \quad \alpha_1 = b+c+d, \quad \beta_1 = -(a+b+c), \quad \gamma_1 = a-d,$$

$$v_n = \alpha_2^n + \beta_2^n + \gamma_2^n, \quad \alpha_2 = a+c+d, \quad \beta_2 = -(a+b+d), \quad \gamma_2 = b-c.$$

Since $\alpha_i + \beta_i + \gamma_i = 0$, the underlying problem is to compute

$$\omega_n = \alpha^n + \beta^n + \gamma^n,$$

where $\alpha$, $\beta$ and $\gamma$ are the roots of the cubic

$$z^3 - pz + q = 0.$$
It is simple to work out an easy special case of Newton's formulae for power sums
of the roots of an algebraic equation. Indeed, the obvious recursion


$$\omega_{n+3} - p\,\omega_{n+1} + q\,\omega_n = 0$$


with the initial values 


yields 
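$$\omega_2 = 2p, \quad \omega_4 = 2p^2,$$
$$\omega_3 = -3q, \quad \omega_5 = -5pq, \quad \omega_7 = -7p^2 q,$$
$$\omega_6 = 2p^3 + 3q^2, \quad \omega_8 = 2p^4 + 8pq^2, \quad \omega_{10} = 2p^5 + 15p^2 q^2.$$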
Form the cubic whose roots are $\alpha_i$, $\beta_i$ and $\gamma_i$:

$$z^3 - p_i z + q_i = 0.$$

We have

$$p_1 = (b+c+d)(a+b+c) + (a-d)^2,$$
$$p_2 = (a+c+d)(a+b+d) + (b-c)^2,$$
$$p_1 - p_2 = 3(bc - ad).$$

Hence $p_1 = p_2$ if and only if

$$ad = bc.$$

Assume this condition and set

$$p_1 = p_2 = p, \qquad \Delta = q_1^2 - q_2^2.$$

Now the $u_n = \omega_n(p_1, q_1)$ and the $v_n = \omega_n(p_2, q_2)$ given by the computed $\omega_n =
\omega_n(p, q)$ show that

$$u_2 = v_2, \quad u_4 = v_4,$$
$$u_6 - v_6 = 3\Delta, \quad u_8 - v_8 = 8p\Delta, \quad u_{10} - v_{10} = 15p^2\Delta.$$


So we have Ramanujan's ingenious parametric construction of equal sums of three
$n$th powers ($n = 2, 4$), and Ramanujan's identity. Clearly, for both these results,
the condition $ad = bc$ is crucial. Ramanujan must have been primarily looking for
the first one because of its number-theoretic significance, the second being
incidental and apparently the only one of its kind in this context.
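
The computation above is easy to confirm with exact integer arithmetic. The following sketch is an illustration only: the function names are ours, the sample values $a, b, c, d = 1, 2, 3, 6$ are an arbitrary choice with $ad = bc$, and the starting values $\omega_0 = 3$, $\omega_1 = 0$, $\omega_2 = 2p$ used for the recursion are our assumption (they follow from $\alpha + \beta + \gamma = 0$ and $\alpha\beta + \beta\gamma + \gamma\alpha = -p$).

```python
def signed_power_sum(a, b, c, d, n):
    """The bracketed expression in the identity for exponent n (equals u_n - v_n for even n)."""
    return ((b + c + d) ** n - (a + c + d) ** n - (a + b + d) ** n
            + (a + b + c) ** n + (a - d) ** n - (b - c) ** n)

def ramanujan_sides(a, b, c, d):
    """Return (left side, right side) of the 6-10-8 identity."""
    lhs = 64 * signed_power_sum(a, b, c, d, 6) * signed_power_sum(a, b, c, d, 10)
    rhs = 45 * signed_power_sum(a, b, c, d, 8) ** 2
    return lhs, rhs

def power_sums(p, q, upto):
    """omega_0..omega_upto for the roots of z^3 - p z + q = 0 (assumed start 3, 0, 2p)."""
    w = [3, 0, 2 * p]
    while len(w) <= upto:
        # omega_{n+3} = p * omega_{n+1} - q * omega_n
        w.append(p * w[-2] - q * w[-3])
    return w

if __name__ == "__main__":
    a, b, c, d = 1, 2, 3, 6            # ad = bc = 6
    print(ramanujan_sides(a, b, c, d))  # the two entries should coincide
    p = (b + c + d) * (a + b + c) + (a - d) ** 2
    q1 = -(b + c + d) * (-(a + b + c)) * (a - d)   # q = -(product of the roots)
    w = power_sums(p, q1, 10)
    print(w[6] == 2 * p**3 + 3 * q1**2, w[5] == -5 * p * q1)
```

Both sides agree for this choice, and the recursion reproduces the closed forms for $\omega_5$ and $\omega_6$.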

For special choices of the parameters, the equal sums of three nth powers 
(n = 2,4) constructed by Ramanujan may present the same terms! This happens, 
for instance, when 


$$a = b \ (c = d), \quad a = c \ (b = d), \quad b = 0 = d \ (a \ne 0), \quad c = 0 = d \ (a \ne 0).$$


Barring such cases, the construction yields numbers expressible as sums of three 
nth powers (n = 2,4) in two different ways. This observation, which we owe to a 
comment of the referee/editor, does not point to any flaw in the construction for
which what really matters is its algebraic formulation. 
I wish to thank Professor Bhargava for having kindly shown me the proof sheets of [1] and a preprint
On an Identity of Daubechies


Doron Zeilberger 


Tossing a coin (whose Pr(head) = $p$) until reaching $n$ heads or $n$ tails and
equating the probability, 1, of finishing with the sum of the probabilities of all the
possible final outcomes leads to

$$\sum_{i=0}^{n-1} \binom{n+i-1}{i} p^n (1-p)^i + \sum_{i=0}^{n-1} \binom{n+i-1}{i} p^i (1-p)^n = 1,$$

which was proved in [1] (pp. 167-171) and [2] using Bezout's theorem and
induction respectively. Rolling a $k$-faced die instead leads to the multivariate
generalization

$$\sum_{i=1}^{k} \ \sum_{\substack{0 \le a_j \le n-1 \\ j \ne i}} \frac{(a_1 + \cdots + a_{i-1} + (n-1) + a_{i+1} + \cdots + a_k)!}{a_1! \cdots a_{i-1}! \, (n-1)! \, a_{i+1}! \cdots a_k!} \; p_1^{a_1} \cdots p_{i-1}^{a_{i-1}} \, p_i^{n} \, p_{i+1}^{a_{i+1}} \cdots p_k^{a_k} = 1,$$

provided $p_1 + \cdots + p_k = 1$.
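
Both identities can be verified exactly for small parameters. The sketch below is an illustration, not part of the note: the function names are ours, probabilities are passed as `fractions.Fraction` values so that the sums are exact, and the die version reads the inner sum as ranging over counts $a_j \in \{0, \ldots, n-1\}$ for every face $j \ne i$.

```python
from fractions import Fraction
from itertools import product
from math import comb, factorial

def coin_sum(n, p):
    """Sum of the probabilities of all final outcomes for the coin game."""
    q = 1 - p
    return sum(comb(n + i - 1, i) * (p**n * q**i + p**i * q**n) for i in range(n))

def die_sum(n, probs):
    """Same sum for a k-faced die; face i is the one that reaches n first."""
    k = len(probs)
    total = Fraction(0)
    for i in range(k):
        others = [j for j in range(k) if j != i]
        for counts in product(range(n), repeat=k - 1):
            # Arrangements of the rolls preceding the final roll of face i.
            ways = factorial(sum(counts) + n - 1) // factorial(n - 1)
            weight = probs[i] ** n
            for j, a in zip(others, counts):
                ways //= factorial(a)
                weight *= probs[j] ** a
            total += ways * weight
    return total

if __name__ == "__main__":
    print(coin_sum(4, Fraction(1, 3)))
    print(die_sum(3, [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]))
```

Each call should return exactly 1, whatever admissible probabilities are supplied.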
Data Compression


Catherine C. McGeoch 


Every object stored in a computer, whether an integer, the text of the Oxford 
English Dictionary, or a digitized image of the Mona Lisa, must first be encoded
into a sequence of 0's and 1's (called bits). Alphabetic characters are usually
represented according to either the ASCII (ask-ee) or the EBCDIC (ib-se-dic)
standard code. For example, "A" is encoded 01000001 in ASCII and 11000001 in
EBCDIC. 

Suppose you want to store the text of Far from the Madding Crowd by Thomas 
Hardy. The book contains 768,771 characters: since both standard codes use 8 bits 
(one byte) per character the book would occupy slightly over half of a 3.5 inch 
floppy disk. Methods of data compression can be applied so that Hardy’s book 
requires an average of 2.48 bits per character [1], thereby reducing the storage 
requirements by a factor of three. 

Samuel Morse used a form of data compression in the design of his famous
code. The frequently used letters have short sequences (E and T are . and -),
and the less common letters have long sequences (Y and Z are -.-- and --..).
Although an alphabet of 30 characters requires 3.26 = [2·1 + 4·2 + 8·3 + 16·4]/30
bits per character on average (using two 1-bit codes, four 2-bit codes, and
so on), we might expect that a message in Morse Code would be shorter than
average because letters with short codes appear frequently. In this column we shall
examine a data compression scheme that produces optimally-short encodings.

First, some definitions. An alphabet is a finite set of characters. We will denote
a special alphabet $B = \{0, 1\}$. A word is a finite sequence of characters from some
alphabet. (Although examples in this column will use "natural English" words, this
need not be the case in general.) A message is a sequence of words. A code $C$ is a
one-to-one onto function mapping a set of source words $W = \{w_1, w_2, \ldots, w_n\}$
from some alphabet to a set of code words $\{b_1, \ldots, b_n\}$ from $B$. We encode a
sequence of source words by applying $C$ to each word in sequence. We decode a
message by applying the inverse function $C^{-1}$ to the coded words.

A prefix code is one in which no code word is a proper prefix of another. Prefix
codes are desirable because it is easy to break a coded message into words when
decoding. Figure 1, for example, shows two codes for a word set $W$. Code $C_1$ is a
prefix code and $C_2$ is not. There is no ambiguity decoding M = 1111100110
according to $C_1$, but decoding with $C_2$ produces (at least) two different source
messages.


  P      W       C_1    C_2
 .40    not      110    11
 .35    save     00     11111
 .14    the      01     001
 .06    trust    111    111
 .05    queen    10     10


Figure 1. Two codes for the same set of source words. The first is a prefix code, the second is not. 
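
The difference between the two codes of Figure 1 can be seen by listing every way a bit string splits into code words. The sketch below is only an illustration; the function name `parses` and the dictionary layout are ours.

```python
C1 = {"not": "110", "save": "00", "the": "01", "trust": "111", "queen": "10"}
C2 = {"not": "11", "save": "11111", "the": "001", "trust": "111", "queen": "10"}

def parses(bits, code):
    """Return every list of source words whose encodings concatenate to `bits`."""
    if not bits:
        return [[]]
    results = []
    for word, codeword in code.items():
        if bits.startswith(codeword):
            for rest in parses(bits[len(codeword):], code):
                results.append([word] + rest)
    return results

if __name__ == "__main__":
    M = "1111100110"
    print(parses(M, C1))   # a single decoding under the prefix code
    print(parses(M, C2))   # several decodings under the non-prefix code
```

Under $C_1$ the message has exactly one decoding, while under $C_2$ the same bits can be read in more than one way.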


Let us assume that the source words $W = \{w_1, \ldots, w_n\}$ appear in source
messages according to some fixed probability distribution $P = \{p_1, \ldots, p_n\}$. For a
particular code $C$, let $l(C, i)$ denote the length (number of bits) of $C(w_i)$. The
expected word length in a random coded message is therefore $L(C) = \sum_{i=1}^{n} p_i \cdot l(C, i)$.
Given $W$ and $P$, how shall we construct an optimal prefix code having
minimum expected word length?

Good question. Is $C_1$ an optimal prefix code for the probabilities given in
Figure 1? Can you find a better code?


HUFFMAN CODES. In 1952 D. A. Huffman developed an elegant and efficient
method for constructing optimal prefix codes given $W$ and $P$. He did this by
building an encoding tree, which is a binary tree such that every node $j$ has an
associated cost $c_j$ and has either 2 or 0 children. Each source word $w_i$ is
represented by a leaf node $i$ in the tree having cost assigned such that $c_i = p_i$. Left
branches in encoding trees are labeled 0 and right branches are labeled 1.

Every prefix code $C$ is represented by an encoding tree $T_C$. In Figure 2, for
example, the encoding $C_1$(queen) = 10 is found by reading edge labels downward
from the root to the leaf labeled "queen". The prefix property is ensured because
no word is an ancestor of another in the tree.

The depth $d_i$ of node $i$ is its distance from the root. The weighted path length of
leaf node $i$ is $a_i = p_i \cdot d_i$. The average path length $PL(T_C)$ of the tree is found by
summing weighted path lengths over leaves and is therefore equal to the expected
word length $L(C)$ of the code. In Figure 2, the "queen" node has depth 2 and
weighted path length .10. The average path length for this tree is 2.46.
Figure 2. An encoding tree for code $C_1$.


Huffman’s tree construction method works as follows. 


1. Begin with a list of one-node trees corresponding to words $w_1, \ldots, w_n$ and
having costs $c_1, \ldots, c_n$ equal to $p_1, \ldots, p_n$ respectively.

2. Repeat the next two steps $n - 1$ times:

3. Find two trees $Q$ and $R$ in the list having smallest costs $c_q$ and $c_r$ at their
root nodes (breaking ties arbitrarily). Remove them from the list.

4. Construct a new tree $S$ as follows. Make a new root node $s$ having $Q$ as its
left subtree and $R$ as its right subtree. The cost of node $s$ is $c_s = c_q + c_r$.
Add tree $S$ to the list.


At the end of this process the list will contain a single encoding tree, called a 
Huffman tree. We shall prove that any Huffman tree has minimal average path 
length. 
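
The four steps above can be sketched with a priority queue. This is a minimal illustration rather than the column's own program: the names are ours, costs are given in hundredths so that all arithmetic is exact, and each tree is represented only by the list of words at its leaves, which is enough to recover leaf depths.

```python
import heapq
from itertools import count

def huffman_depths(weights):
    """Return {word: depth of its leaf} in a Huffman tree built on `weights`."""
    tie = count()  # tie-breaker so the heap never compares the word lists
    heap = [(w, next(tie), [word]) for word, w in weights.items()]
    heapq.heapify(heap)
    depth = {word: 0 for word in weights}
    while len(heap) > 1:
        cq, _, q = heapq.heappop(heap)   # two trees of smallest root cost
        cr, _, r = heapq.heappop(heap)
        for word in q + r:               # every leaf below the new root sinks one level
            depth[word] += 1
        heapq.heappush(heap, (cq + cr, next(tie), q + r))
    return depth

def weighted_length(weights, lengths):
    """Sum of cost times code length (or leaf depth) over all words."""
    return sum(weights[w] * lengths[w] for w in weights)

if __name__ == "__main__":
    P = {"not": 40, "save": 35, "the": 14, "trust": 6, "queen": 5}  # Figure 1, in hundredths
    C1 = {"not": "110", "save": "00", "the": "01", "trust": "111", "queen": "10"}
    print(weighted_length(P, {w: len(cw) for w, cw in C1.items()}))  # 246, i.e. 2.46
    print(weighted_length(P, huffman_depths(P)))
```

The first printed value reproduces the 2.46 quoted for Figure 2; the second is the average path length of a Huffman tree for the same probabilities, again in hundredths.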


Theorem. Let $T_H$ be a Huffman tree constructed for a given set of words $W$ and
probabilities $P$. Then for any encoding tree $T$ constructed on $W$ and $P$, $PL(T_H) \le
PL(T)$.


Proof: The proof is by induction on the number of leaves in $T_H$. If $T_H$ has one or
two leaves the encoding tree is unique and the proof is trivial.

Suppose $T_H$ has $n > 2$ leaves and let nodes $i$ and $j$ be the nodes of minimal
cost that were selected in the first step of the construction. These are necessarily
leaf nodes in $T_H$ and they are necessarily siblings (having a common parent node).
Construct a new tree $T'_H$ containing $n - 1$ leaves by removing $i$ and $j$: their
common parent node $x$ becomes a new leaf having cost $c_x = c_i + c_j$. Since the
only effect of this modification is to move the combined cost $c_x$ one level closer to
the root, we have $PL(T_H) = PL(T'_H) + c_x$.

Let $w_x$ be formed by the concatenation of $w_i$ and $w_j$ and let $p_x = p_i + p_j$. Then
$T'_H$ is a Huffman tree constructed on the word set $W' = W - \{w_i, w_j\} + \{w_x\}$ with
probabilities $P' = P - \{p_i, p_j\} + \{p_x\}$.

Now construct a tree $T'$ from $T$ by removing the same nodes $i$ and $j$ and
replacing them with a new node $x$. Since $T$ is an encoding tree for $W$ these are
necessarily leaf nodes, but they are not necessarily siblings in $T$. We have two
cases:


1. If $i$ and $j$ are siblings then remove them and form a new leaf node from the
parent $x$ exactly as was done for $T_H$. The new tree $T'$ is an encoding tree for
$W'$ and $P'$, and $PL(T) = PL(T') + c_x$.

2. If $i$ and $j$ are not siblings then adjust the tree to make them siblings: if $i$ has
greater depth than $j$ exchange $j$ with the sibling of $i$, and if $j$ has greater
depth than $i$ exchange $i$ with the sibling of $j$ (if they have equal depth then it
doesn't matter which gets exchanged). That is, suppose $d_i > d_j$ and that
node $s$ is the sibling of $i$. Detach the subtree having root $s$ and move it (up)
to $j$'s place in the tree, and move $j$ (down) to $s$'s location in the tree. (The
case $d_i < d_j$ is handled similarly.)

Moving $j$ increases its depth by $\delta = d_i - d_j$ and increases the average
path length of the tree by $\delta c_j$. But every leaf node that is moved along with $s$
has its path length decreased by $\delta$: since $j$ was chosen to have minimal cost
(except possibly for $i$), the net effect of the exchange operation cannot be an
increase in average path length. Letting $T_0$ denote the new tree, we have
$PL(T_0) \le PL(T)$.

Now Case 1 holds. Remove nodes $i$ and $j$ and replace with $x$ as above to
form $T'$ from $T_0$. Then $PL(T') + c_x = PL(T_0) \le PL(T)$.


By the induction hypothesis $PL(T'_H) \le PL(T')$. Combining this with the above
inequalities completes the proof. □


OTHER CODES. One practical problem with Huffman’s Code is that either the 
probabilities P must be estimated beforehand or the text to be encoded must be 
pre-scanned to determine word frequencies. This leads to inefficiencies in either 
the length of coded messages or in the time required to encode messages. A 
dynamic code C allows the mapping of source words to code words to change "on
the fly" as the message is being encoded. Some methods (most notably Lempel-Ziv
encodings) modify W dynamically as well as C. The compression factor of three 
mentioned earlier for Hardy’s text is achieved by a dynamic method that combines 
several compression ideas [1]. 

Some codes are specialized for data other than (English) text. A digitized image 
of the Mona Lisa, for example, will tend to have long sequences of identical source 
words (which represent colors and intensities). Run-length encoding maps a
sequence such as yyyyyyyyybbbbbbggrrrrrrrrrr into a sequence of pairs 9y, 6b, 2g, 10r
which may be compressed further.
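
Run-length encoding as just described takes only a few lines. The sketch below is an illustration with names of our own choosing; it represents each pair as a (count, symbol) tuple.

```python
from itertools import groupby

def run_length_encode(symbols):
    """Collapse each run of identical symbols into a (count, symbol) pair."""
    return [(len(list(group)), symbol) for symbol, group in groupby(symbols)]

def run_length_decode(pairs):
    """Inverse of run_length_encode for string input."""
    return "".join(symbol * n for n, symbol in pairs)

if __name__ == "__main__":
    text = "y" * 9 + "b" * 6 + "g" * 2 + "r" * 10
    pairs = run_length_encode(text)
    print(pairs)
    print(run_length_decode(pairs) == text)
```

On the example string this yields the pairs 9y, 6b, 2g, 10r, and decoding restores the original sequence.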

For a detailed discussion of static and dynamic Huffman codes and of the 
Lempel-Ziv method, see Lewis and Denenberg [2]. Lelewer and Hirshberg [3] 
provide an extensive and detailed survey of several data compression schemes 
along with some experimental comparisons. Several methods for data modeling 
with applications to text compression are surveyed by Bell et al. [1]. 
It is the consensus of opinion among college teach-
ers of mathematics (See J. Seidlin, Mathematics 
Teacher, Dec. 1932) and science that the secondary 
schools produce graduates with the following general 
characteristics: 


(1) Worn out or weary of mathematics, 

(2) No inspiration for individual investigation, 

(3) No appreciation of accuracy, 

(4) Not able to place a decimal point in its proper
place,

(5) Direct and inverse proportions are meaningless. 


—American Mathematical Monthly 
40, (1933) p. 382 


1993] COMPUTER SCIENCE SAMPLER 497 


PROBLEMS AND SOLUTIONS 


Edited by: 
Richard T. Bumby, Fred Kochman and Douglas B. West 


Proposed problems should be sent to the MONTHLY PROBLEMS address given on 
the inside front cover. Please include solutions, relevant references, etc. Three copies 
are requested. 


Solutions of published problems should arrive before October 31, 1993 at the 
MONTHLY PROBLEMS address given on the inside front cover. Solutions should be 
typed with double spacing, including the problem number and the solver’s name and 
mailing address. Two copies suffice. A self-addressed postcard or label should be 
included if an acknowledgment is desired. 


An asterisk (* ) after the number of a problem, or part of a problem, indicates that 
no solution is currently available. Partial solutions will be useful in such cases. 
Otherwise, the published solution is likely to be based on a solution which is complete 
and correct. Of course, an elegant partial solution or a method leading to a more 
general result is always useful and welcome. In addition, references to other 
appearances of MONTHLY problems or to solutions of these problems in the 
literature are also solicited. 


PROBLEMS 


10306. Proposed by Seung-Jin Bang, Seoul, Korea. 


Find all positive integers $n$ such that the polynomial
$$a^n(b - c) + b^n(c - a) + c^n(a - b)$$
has $a^2 + b^2 + c^2 + ab + bc + ca$ as a factor.


10307. Proposed by John Calvin Williams, student, and I. Martin Isaacs, University 
of Wisconsin, Madison, WI. 


Can one construct a set Z of finite groups satisfying the two conditions:

i. Z contains precisely one representative from each isomorphism class.

ii. If A ∈ Z is isomorphic to a subgroup of B ∈ Z, then A is a subgroup
of B?


10308. Proposed by Robert Connelly and John H. Hubbard, Cornell University,
Ithaca, NY, and Walter Whiteley, York University, North York, Ontario, Canada.


Suppose that $p_1, p_2, p_3, q_1, q_2, q_3$ are six points in the plane and that the
distance between $p_i$ and $q_j$ ($i, j = 1, 2, 3$) is $i + j$. Show that the six points are
collinear.
10309. Proposed by Walter Rudin, University of Wisconsin, Madison, WI.


Compute

$$\exp\left\{\frac{1}{2\pi} \int_0^{2\pi} \log(A + B\cos\theta)\, d\theta\right\}$$

when $A > B > 0$. The answer should be given as an algebraic function of $A$
and $B$.


10310. Proposed by E. Rodney Canfield, University of Georgia, Athens, GA. 


Fix an integer $r \ge 2$. Using Stirling's formula we may find constants $c_1$ and $c_2$
such that

$$\binom{rm}{m} \sim \frac{c_1 (c_2)^m}{m^{1/2}}$$

as $m \to \infty$. Prove that the ratio $\binom{rm}{m} m^{1/2} / c_2^m$ is an increasing function of $m$ for
$m \ge 1$.


10311. Proposed by Solomon W. Golomb, University of Southern California, Los 
Angeles, CA. 


It is well-known that if $g$ is a primitive root modulo $p$, where $p > 2$ is prime,
either $g$ or $g + p$ (or both) is a primitive root modulo $p^2$ (indeed modulo $p^k$ for
all $k \ge 1$).

(a) Find an example of a prime $p > 2$, and a primitive root $g$ modulo $p$ with
$1 < g < p$ such that $g$ is not a primitive root modulo $p^2$.

(b) Show that, among all $\phi(p - 1)$ primitive roots $g$ modulo $p$ with $1 < g < p$,
at least half of them are also primitive roots modulo $p^2$.


10312. Proposed by Hongyuan Zha, IMA—University of Minnesota, Minneapolis, 
MN. 


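Let $c$ and $s$ be non-negative real numbers satisfying $c^2 + s^2 = 1$. Prove that, for
$n > 1$,

$$s^{n-2}\sqrt{1 + c}$$

is the second smallest singular value of the $n$ by $n$ upper triangular matrix

$$T_n(c) = \operatorname{diag}(1, s, \ldots, s^{n-1}) \begin{pmatrix} 1 & -c & -c & \cdots & -c \\ & 1 & -c & \cdots & -c \\ & & \ddots & & \vdots \\ & & & 1 & -c \\ & & & & 1 \end{pmatrix}.$$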


10313. Proposed by O. Krafft and M. Schaefer, Rheinisch-Westfalische Technische 
Hochschule, Aachen, Germany. 


Let $a \in [-1/5, 1)$ and let $\mathcal{X}_a$ denote the set of random variables $X$ satisfying
$a \le X \le 1$. Show that

$$\max\{EX^2 \, EX^4 - (EX^3)^2 : X \in \mathcal{X}_a\} = 2^{-6}$$

if and only if $a \in [-1/5, 1/2]$.
NOTES


Notes: (10311) The multiplicative group modulo the power of an odd prime is
always cyclic, and the term primitive root is the traditional name in elementary
number theory for a generator of this group. Fundamental properties can be found
in textbooks such as I. Niven, H. S. Zuckerman and H. L. Montgomery, An
Introduction to the Theory of Numbers (fifth edition). A consequence of (b) is that
for every prime $p > 2$ there is at least one $g$ with $1 < g < p$ which is a primitive
root modulo $p^k$ for all $k \ge 1$. (10312) The matrix $T_n(c)$ is a well known example in
numerical linear algebra. More details can be found in G. Golub & C. Van Loan,
Matrix Computations. It should be noted that there is no simple expression for the
smallest singular value of $T_n(c)$.


SOLUTIONS 


Solving the Velocity Composition Equation of Special Relativity 


6659 [1991, 445]. Proposed by Abraham Ungar, North Dakota State University, 
Fargo, ND. 


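Let $\mathbb{R}_c^3$ be the subset of the Euclidean 3-space $\mathbb{R}^3$ given by the equation

$$\mathbb{R}_c^3 = \{x \in \mathbb{R}^3 : |x| < c\},$$

where $c$ is a positive constant. In the special theory of relativity $c$ represents the
speed of light, and the elements $x$ of $\mathbb{R}_c^3$ are admissible velocities. The relativistic
velocity composition law is given by the equation

$$x * y = \frac{x + y}{1 + x \cdot y/c^2} + \frac{1}{c^2} \, \frac{\gamma_x}{\gamma_x + 1} \, \frac{x \times (x \times y)}{1 + x \cdot y/c^2}, \qquad x, y \in \mathbb{R}_c^3,$$

where $\gamma_x$ is the Lorentz factor

$$\gamma_x = \frac{1}{\sqrt{1 - x \cdot x/c^2}}.$$

It is known that the space $\mathbb{R}_c^3$ is closed under the relativistic velocity composition:
if $x, y \in \mathbb{R}_c^3$ then $x * y \in \mathbb{R}_c^3$.
For given $a, b \in \mathbb{R}_c^3$ solve each of the two velocity composition equations

$$a * x = b \qquad (1)$$

and

$$x * a = b \qquad (2)$$

for the unknown $x \in \mathbb{R}_c^3$.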
Solution by Rolf Richberg, RWTH Aachen, Aachen, Germany. For $x \in \mathbb{R}_c^3$,
$x/c \in \mathbb{R}_1^3$ and this rescaling is compatible with the definitions of $*$ in $\mathbb{R}_c^3$ and $\mathbb{R}_1^3$,
so it suffices to consider $\mathbb{R}_1^3$. In this case $\gamma_x = (1 - x \cdot x)^{-1/2}$. Elementary vector
algebra then yields that

$$x * y = \frac{\gamma_x}{1 + \gamma_x}\left[1 + \frac{1}{\gamma_x(1 + x \cdot y)}\right] x + \frac{1}{\gamma_x(1 + x \cdot y)}\, y \qquad (A)$$

$$x \cdot (x * y) = 1 - \frac{1}{\gamma_x^2 (1 + x \cdot y)} \qquad (B)$$

$$\gamma_{x * y} = \gamma_x \gamma_y (1 + x \cdot y) \qquad (C)$$

$$(-x) * (x * y) = y \qquad (D)$$

for all $x, y \in \mathbb{R}_1^3$. The special case

$$x * x = \frac{2}{1 + x \cdot x}\, x$$

of (A) is also worthy of note. It is now a simple matter to solve equation (1). From
(D), we know that $a * ((-a) * b) = b$, which shows that $x = (-a) * b$ is a solution to
(1). On the other hand, (1) implies that $(-a) * b = (-a) * (a * x) = x$. Thus (1) has
the sole solution $x = (-a) * b$.

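A slightly greater effort is required to solve equation (2). Assuming that $x$ is a
solution of (2), (C) yields $\gamma_x(1 + x \cdot a) = \gamma_b/\gamma_a$. Then (A) gives

$$\gamma_b b = \frac{\gamma_b \gamma_x}{1 + \gamma_x}\left[1 + \frac{1}{\gamma_x(1 + a \cdot x)}\right] x + \frac{\gamma_b}{\gamma_x(1 + a \cdot x)}\, a = \frac{\gamma_x}{1 + \gamma_x}(\gamma_b + \gamma_a)\, x + \gamma_a a.$$

Now, let

$$v = \frac{\gamma_x}{1 + \gamma_x}\, x = \frac{\gamma_b b - \gamma_a a}{\gamma_b + \gamma_a}.$$

In view of $v \cdot v = (\gamma_x - 1)/(\gamma_x + 1)$, we have $\gamma_x = (1 + v \cdot v)/(1 - v \cdot v)$ and

$$x = \frac{2}{1 + v \cdot v}\, v = v * v.$$

On the other hand, given $a, b \in \mathbb{R}_1^3$, define $v = (\gamma_b b - \gamma_a a)/(\gamma_b + \gamma_a)$ and $x =
v * v$. Then, using (C), we get

$$v \cdot v = 1 - \frac{2(1 + \gamma_{a * b})}{(\gamma_b + \gamma_a)^2}.$$

Also

$$x \cdot x = 1 - \left(\frac{1 - v \cdot v}{1 + v \cdot v}\right)^2 < 1$$

yields

$$\gamma_x = \frac{1 + v \cdot v}{1 - v \cdot v} \quad \text{and} \quad v = \frac{\gamma_x}{1 + \gamma_x}\, x.$$

Now, by (C),

$$2a \cdot v = \frac{2}{\gamma_b + \gamma_a}\left[\gamma_b\left(\frac{\gamma_{a * b}}{\gamma_a \gamma_b} - 1\right) - \gamma_a\left(1 - \frac{1}{\gamma_a^2}\right)\right] = \frac{2(1 + \gamma_{a * b})}{\gamma_a(\gamma_b + \gamma_a)} - 2 = \frac{\gamma_b}{\gamma_a}(1 - v \cdot v) - 1 - v \cdot v,$$

and hence

$$\gamma_x(1 + a \cdot x) = \frac{1 + v \cdot v}{1 - v \cdot v}\left[1 + \frac{2}{1 + v \cdot v}\, a \cdot v\right] = \frac{\gamma_b}{\gamma_a},$$

which, with (A), yields

$$x * a = \left[1 + \frac{\gamma_a}{\gamma_b}\right] v + \frac{\gamma_a}{\gamma_b}\, a = b.$$

This settles the case of equation (2). These formulas: $x = (-a) * b$ in (1); and
$x = v * v$ with $v = (\gamma_b b - \gamma_a a)/(\gamma_b + \gamma_a)$ in (2) use only expressions preserved by
the mappings used to rescale $c$. Hence they are valid for all $c > 0$.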
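
The two solution formulas can be checked numerically. The sketch below is only an illustration: it implements the composition law directly from the problem statement, the function names are ours, and the sample vectors are arbitrary points of the open ball of radius $c$.

```python
import math

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def gamma(x, c=1.0):
    """Lorentz factor of the velocity x."""
    return 1.0 / math.sqrt(1.0 - dot(x, x) / c**2)

def compose(x, y, c=1.0):
    """Relativistic velocity composition x * y."""
    g = gamma(x, c)
    denom = 1.0 + dot(x, y) / c**2
    k = g / ((g + 1.0) * c**2)
    w = cross(x, cross(x, y))
    return tuple((xi + yi + k * wi) / denom for xi, yi, wi in zip(x, y, w))

def solve_left(a, b, c=1.0):
    """Solution of a * x = b, namely x = (-a) * b."""
    return compose(tuple(-ai for ai in a), b, c)

def solve_right(a, b, c=1.0):
    """Solution of x * a = b, namely x = v * v with v as in the text."""
    ga, gb = gamma(a, c), gamma(b, c)
    v = tuple((gb * bi - ga * ai) / (gb + ga) for ai, bi in zip(a, b))
    return compose(v, v, c)

if __name__ == "__main__":
    a, b = (0.3, -0.2, 0.5), (-0.1, 0.4, 0.2)
    print(compose(a, solve_left(a, b)))    # should reproduce b
    print(compose(solve_right(a, b), a))   # should reproduce b
```

Both printed vectors should agree with $b$ up to rounding, for $c = 1$ and, after rescaling the inputs, for other values of $c$.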


Editorial comment. The proposer’s proof is contained in his paper, “Thomas 
precession and its associated grouplike structure’, Am. J. Phys. 59 (1991), 824-834, 
which explores the abstract algebraic properties of addition of velocities in special 
relativity. In particular, weak versions of associative and commutative laws can be 
found which enable equations (1) and (2) to be solved by operations resembling 
those used in associative algebras. 

Thomas N. Delmer approached the problem by analogy to the use of quaternions
to study rotations in Euclidean 3-space. The matrix

$$T(x) = \begin{pmatrix} \gamma & -\gamma x^T/c \\ -\gamma x/c & I + (\gamma - 1)\, x x^T/|x|^2 \end{pmatrix}$$

describes the left action of $x$ on columns occurring as first columns of $T(y)$ for
$y \in \mathbb{R}_c^3$. The solution of equation (1) follows from the fact that $T(x)^{-1} = T(-x)$.
To solve equation (2), one linearizes the problem by writing $T = CC'$ where $C$ is a
matrix whose inverse is its complex conjugate $\bar{C}$ and whose entries depend linearly
on four real parameters. The equation $Ta = b$ then takes the form $C'a = \bar{C}b$,
which is a system of linear equations in the parameters defining $C$. This use of the
matrix $C$ corresponds to the vector $v$ in the solution above.


Solved also by R. J. Chapman (U.K.), T. N. Delmer, S. Eder (student, Austria), M. Golomb, T. L.
McCoy, K. McInturff, and the proposer. Two incorrect solutions were received.


An Aperiodic Sequence 


E 3457 [1991, 754]. Proposed by Herbert S. Wilf, University of Pennsylvania,
Philadelphia, PA.


Find all positive integers $k$ such that the sequence

$$\left\{\binom{2n}{n}\right\}$$

is periodic modulo $k$ from some point onward.
Solution by Jerrold R. Griggs, University of South Carolina, Columbia, SC. The
only such values of $k$ are 1 and 2. The case $k = 1$ is trivial, while for $k = 2$ the
familiar binomial coefficient recursion and symmetry yields

$$\binom{2n}{n} = \binom{2n-1}{n} + \binom{2n-1}{n-1} = 2\binom{2n-1}{n-1} \equiv 0 \pmod 2$$

for all $n \ge 1$.

Next let $k = 4$. By the binomial theorem, $\binom{2n}{n}$ (mod 4) is the coefficient of $x^n$ in
the expansion of $(1 + x)^{2n}$ over $Z_4[x]$. We claim that $(1 + x)^{2^m} = 1 + 2x^{2^{m-1}} + x^{2^m}$
over $Z_4[x]$ for $m \ge 1$. This is immediate for $m = 1$ and readily verified for $m > 1$
by induction on $m$ by multiplying out $(1 + x)^{2^{m+1}} = ((1 + x)^{2^m})^2$. Hence $\binom{2n}{n} \equiv 2$
(mod 4) when $n = 2^j$ for $j \ge 0$. On the other hand, if $n \ge 3$ is not a power of 2,
say $n = 2^j + r$ where $0 < r < 2^j$, we obtain over $Z_4[x]$ that

$$(1 + x)^{2n} = (1 + x)^{2^{j+1}} (1 + x)^{2r} = (1 + 2x^{2^j} + x^{2^{j+1}})(1 + x)^{2r}.$$

Since $2r < n < 2^{j+1}$, the only contribution to the coefficient of $x^n$ is $2\binom{2r}{r}$, which
is divisible by 4 since $\binom{2r}{r}$ is divisible by 2. Hence $\{\binom{2n}{n}\}$ is not eventually periodic
mod 4, as it contains arbitrarily long finite stretches of zeroes modulo 4.

Next suppose that $k$ is an odd prime $p$. Since $p \mid \binom{p}{i}$ for all $0 < i < p$, we have
$(1 + x)^p = 1 + x^p$ over $Z_p[x]$. By induction, it follows for $m \ge 1$ that $(1 + x)^{p^m} =
1 + x^{p^m}$. Hence for $0 \le j < p^m$ the coefficient of $x^i$ in $(1 + x)^{p^m + j}$ is 1 for $i = j$
and $i = p^m$ but 0 for $j < i < p^m$. Also, the coefficient of $x^{p^m}$ in $(1 + x)^{2p^m}$ is
2 mod $p$. Again, the sequence $\{\binom{2n}{n}\}$ mod $k$ contains arbitrarily long finite stretches
of zeroes and cannot be eventually periodic.

Each remaining value of $k$ is divisible by an odd prime or by 4; call this divisor
$d$. The sequence cannot be eventually periodic mod $k$, else it would be eventually
periodic mod $d$ as well, which we have shown cannot happen.
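
The residue patterns used in the solution are easy to observe directly. The sketch below is an illustration only; the function name is ours and the ranges checked are arbitrary.

```python
from math import comb

def central_binomials_mod(k, n_max):
    """Residues of C(2n, n) modulo k for n = 1, ..., n_max."""
    return [comb(2 * n, n) % k for n in range(1, n_max + 1)]

def is_power_of_two(n):
    return n & (n - 1) == 0

if __name__ == "__main__":
    mod4 = central_binomials_mod(4, 32)
    # Residue 2 exactly at powers of 2, residue 0 elsewhere.
    print([n for n, v in enumerate(mod4, start=1) if v != 0])
    mod3 = central_binomials_mod(3, 81)
    # A stretch of zeroes for 3^4 / 2 < n < 3^4, then a nonzero value at n = 81.
    print(mod3[40:80], mod3[80])
```

Modulo 4 the nonzero residues sit exactly at the powers of 2, and modulo 3 the zero run before $n = 81$ has length 40, in line with the argument for odd primes.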


Solved also by R. J. Chapman (U.K.), P. CiZek (student, France), M. Dindos (Slovakia), R. B. 
Eggleton (Brunei), N. J. Fine, I. Gessel, R. Holsager, I. Kastanas, K. S. Kedlaya (student), N. Komanda, 
O. P. Lossers (The Netherlands), D. Magagnosc, I. Nemes (Austria), A. Nijenhuis, A. Pedersen 
(Denmark), B. Peterson, N. G. Randolph, I. Vardi, Con Amore Problem Group (Denmark), and the 
proposer. 


Subsets Whose Sums Are Congruent 


E 3472 [1991, 956]. Proposed by Hunter Snevily, California Institute of Technology, 
Pasadena, CA. 


Suppose $h$ and $k$ are relatively prime positive integers and $n = h + k$. Show
that for each $j$ there are $h^{-1}\binom{n-1}{k}$ $k$-element subsets of $\{1, 2, \ldots, n - 1\}$ with sum
congruent to $j$ modulo $h$.


Solution by Richard Holzsager, American University, Washington, DC. We
transform the problem slightly. For each $k$-element subset $A = \{a_1, \ldots, a_k\}$ of $\{1, \ldots,
n - 1\}$, labeled so that $a_1 < \cdots < a_k$, define a $k$-element sequence $f(A) =
(b_1, \ldots, b_k)$ by $b_i = a_i - i$ for $1 \le i \le k$. Then $0 \le b_1 \le \cdots \le b_k \le h - 1$, and $f$
is a bijection between the subsets and the nondecreasing $k$-element sequences
bounded between 0 and $h - 1$. Since we have reduced the sum of each set by a
fixed amount ($k(k + 1)/2$), it suffices to show that the number of sequences with
sum congruent to $j$ mod $h$ is independent of $j$.
Consider such a sequence $B$. If we replace each $b_i \in B$ by $b_i + 1$ mod $h$, then
we add $k$ to the sum. If $b_k = h - 1$, then to remain in the specified set of
sequences we must also replace $h$'s by 0's (and cyclically reorder), which does not
change the sum modulo $h$. Since $k$ is relatively prime to $h$, applying this injection
$h$ times leads us back to the original set through all the congruence classes modulo
$h$, so each contains the same number of sequences.
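
For small $h$ and $k$ the claim can be confirmed by brute force. The sketch below is an illustration with a function name of our own; it simply tallies the $k$-element subsets of $\{1, \ldots, n-1\}$ by the residue of their sum modulo $h$.

```python
from itertools import combinations
from math import comb

def residue_counts(h, k):
    """counts[j] = number of k-subsets of {1, ..., h + k - 1} with sum congruent to j mod h."""
    n = h + k
    counts = [0] * h
    for subset in combinations(range(1, n), k):
        counts[sum(subset) % h] += 1
    return counts

if __name__ == "__main__":
    for h, k in [(3, 4), (5, 3), (4, 5)]:
        print(h, k, residue_counts(h, k), comb(h + k - 1, k) // h)
```

For each relatively prime pair every residue class receives the same count, namely $h^{-1}\binom{n-1}{k}$.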


Editorial comment. The proposer and the Anchorage Math Solutions Group 
applied a similar cyclic rotation to the original sets, viewed as subsets of an 
n-element set. Reiner Martin applied the properties of the g-nomial coefficient. 


Solved also by G. Calinescu (Romania), R.J. Chapman (U.K.), M. Dindos (Slovakia), R. Martin 
(student), the Anchorage Math Solutions Group, and the proposer. 


A Ratio with a Cauchy Distribution 


10189 [1992, 60]. Proposed by Ignacy I. Kotlarski, Oklahoma State University, 
Stillwater, OK. 


Suppose $(X_1, X_2)$ and $Y$ are two independent absolutely continuous random
variables, where $(X_1, X_2)$ has a distribution depending only on $X_1^2 + X_2^2$ and $Y$
has an arbitrary distribution. Let $Z = (X_1 - X_2 Y)/(X_1 Y + X_2)$. Show that $Z$ has
a Cauchy distribution.


Solution by Kenneth Schilling, University of Michigan, Flint, MI. For $r > 0$, let
$g(r)$ be the density of $(X_1, X_2)$ at a point $(x_1, x_2)$ with $x_1^2 + x_2^2 = r^2$. By changing
to a form of polar coordinates, $X_1 = R \sin\Theta$ and $X_2 = R \cos\Theta$ with $-\pi < \Theta < \pi$,
we have

$$P(\theta_1 < \Theta < \theta_2) = \int_{\theta_1}^{\theta_2} \int_0^{\infty} r\, g(r)\, dr\, d\theta = \frac{\theta_2 - \theta_1}{2\pi}.$$

Thus $\Theta$ is uniform on $(-\pi, \pi)$, so that $\tan\Theta$ is a Cauchy random variable.

Now let $\Phi = -\arctan Y$ (so that $\Phi$ is a random variable on $(-\pi/2, \pi/2)$). Then
$Z = \tan(\Theta + \Phi)$.

For any fixed real number $\phi$, $\Theta + \phi$ is uniformly distributed modulo $\pi$. Hence,
for fixed real numbers $a$ and $b$,

$$P(a < \tan(\Theta + \phi) < b) = P(a < \tan\Theta < b).$$

Since $\Theta$ and $\Phi$ are independent, we have

$$P(a < Z < b) = P(a < \tan(\Theta + \Phi) < b) = \int_{-\pi/2}^{\pi/2} P(a < \tan(\Theta + \phi) < b)\, dF_\Phi(\phi) = P(a < \tan\Theta < b)$$

and so $Z$ has a Cauchy distribution.


Editorial comment. Most solvers used a similar argument, and many noted that
the absolute continuity of Y is irrelevant. José Luis Palacios employed a result
from B. C. Arnold and P. L. Brockett, "On distributions whose component ratios
are Cauchy", The American Statistician, 46 (1992), 25-26, and Gérard Letac
referred to G. Letac, "Which functions preserve Cauchy laws", Proc. Amer. Math.
Soc., 67 (1977), 277-286 and G. Letac, "Isotropy and sphericity: some
characterizations of the normal distribution", Annals of Statist., 9 (1981), 408-417 for more
general work related to this problem.


Solved also by J. A. Bucklew, D. Callan, R. J. Chapman (U.K.), S. Gleason, E. Hertz, T. Hesterberg, 
N. Kang (student, Korea), K. S. Kedlaya (student), G. Letac (France), A. Nijenhuis, J. L. Palacios, 
D. M. Rosenblum, R. Stong, Anchorage Math Solutions Group, and the proposer. Two incomplete 
solutions were received. 


An Interval of Differences 


10190 [1992, 61]. Proposed by Peter J. Ferraro, Roselle Park, NJ. 


Suppose $t$ is a positive integer congruent to 1 modulo 4 but not a perfect
square. Put $\alpha = (1 + \sqrt{t})/2$.
(a) Prove that if $n$ is a positive integer, then

$$1 \le \lfloor \alpha^2 n \rfloor - \lfloor \alpha \lfloor \alpha n \rfloor \rfloor \le \lfloor \alpha \rfloor.$$

(b) Does every integer in the interval $[1, \lfloor \alpha \rfloor]$ occur as such a difference for
some positive integer $n$?


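Solution by John Henry Steelman, Indiana University of Pennsylvania, Indiana,
PA. We prove part (a) and show that the answer to part (b) is “yes”. To get these
results, let $\theta_n = \alpha n - \lfloor \alpha n \rfloor$. The one-dimensional case of Kronecker’s theorem
(due to Jacobi; see J. F. Koksma, Diophantische Approximationen, Springer, 1936,
Theorem I.5, p. 10) shows that $\{\theta_n\}$ is dense in the interval (0, 1) for irrational $\alpha$.

Now let $t$ and $\alpha$ be as in the statement of the problem. If $t = 1 + 4r$, then a
straightforward calculation yields $\alpha^2 = \alpha + r$. Thus $\alpha^2 n = \alpha n + rn$ and hence
$\lfloor \alpha^2 n \rfloor = \lfloor \alpha n \rfloor + rn$. It follows that

$$(\alpha - 1)\lfloor \alpha n \rfloor = (\alpha - 1)(\alpha n - \theta_n) = rn - (\alpha - 1)\theta_n.$$

Adding $\lfloor \alpha n \rfloor$ to each side of this equation yields

$$\alpha \lfloor \alpha n \rfloor = \lfloor \alpha n \rfloor + rn - (\alpha - 1)\theta_n = \lfloor \alpha^2 n \rfloor - (\alpha - 1)\theta_n.$$

Thus we conclude that $\{\lfloor \alpha^2 n \rfloor - \alpha \lfloor \alpha n \rfloor\}$ is dense in the interval $(0, \alpha - 1)$. As
$\lfloor \alpha^2 n \rfloor - \lfloor \alpha \lfloor \alpha n \rfloor \rfloor = \lceil \lfloor \alpha^2 n \rfloor - \alpha \lfloor \alpha n \rfloor \rceil = \lceil (\alpha - 1)\theta_n \rceil$, we see that the set of such
differences consists of those integers in the interval $[1, \lfloor \alpha \rfloor]$.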
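As a sanity check on both parts, the differences can be tabulated exactly for small $t$. The sketch below is an illustration only; the function names are hypothetical, and it assumes $t \equiv 1 \pmod 4$ is not a perfect square. It avoids floating point by using the integer square root together with the fact that $\lfloor (m + x)/c \rfloor = \lfloor (m + \lfloor x \rfloor)/c \rfloor$ for integers $m$ and $c > 0$.

```python
from math import isqrt


def floor_alpha_times(m, t):
    """Exact floor(alpha * m) for alpha = (1 + sqrt(t)) / 2 and integer m >= 0."""
    # alpha * m = (m + sqrt(t * m^2)) / 2
    return (m + isqrt(t * m * m)) // 2


def floor_alpha_sq_times(n, t):
    """Exact floor(alpha^2 * n), using alpha^2 = ((1 + t) + 2 * sqrt(t)) / 4."""
    # alpha^2 * n = ((1 + t) * n + sqrt(4 * t * n^2)) / 4
    return ((1 + t) * n + isqrt(4 * t * n * n)) // 4


def differences(t, n_max):
    """Set of values floor(alpha^2 n) - floor(alpha * floor(alpha n)) for 1 <= n <= n_max."""
    found = set()
    for n in range(1, n_max + 1):
        inner = floor_alpha_times(n, t)
        found.add(floor_alpha_sq_times(n, t) - floor_alpha_times(inner, t))
    return found


if __name__ == "__main__":
    for t in (5, 13, 17, 21, 29):
        print(t, floor_alpha_times(1, t), sorted(differences(t, 2000)))
```

For each listed $t$ the set of differences found should be exactly the integers from 1 to $\lfloor \alpha \rfloor$, in line with parts (a) and (b); a finite search of this kind illustrates the statement but of course proves nothing.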

Solved also by D. Callan, R. J. Chapman (U.K.), J. Fukuta (Japan), B. Haible (Germany), R.
Holzsager, K. S. Kedlaya (student), O. P. Lossers (The Netherlands), R. Stong, B. M. M. de Weger
(The Netherlands), O. Wyler, University of South Alabama Problem Group, and the proposer.


Collaborating editors: David F. Appleyard, Paul T. Bateman, Bruce C. Berndt, 
Duane M. Broline, Barry W. Brunson, Frank S. Cater, Gulbank D. Chakerian, 
Underwood Dudley, Gerald A. Edgar, Michael A. Filaseta, Ira M. Gessel, Richard 
A. Gibbs, Jerrold R. Griggs, Douglas A. Hensley, John R. Isbell, Mourad E. H. 
Ismail, Murray Klamkin, Daniel J. Kleitman, Frederick W. Luttmann, Frank B. 
Miles, Richard Pfiefer, Stephen L. Portnoy, J. O. Shallit, John Henry Steelman, 
Kenneth B. Stolarsky, David E. Tepper, Douglas B. Tyler, Daniel Ullman, Edward
T. H. Wang, and William E. Watkins. 


Answer to Picture Puzzle: 
(p. 488) 


Both Ivan Niven and Lida Barrett have been president of the MAA. 
Thomas Archer Hirst—
Mathematician Xtravagant 
II. Student Days in Germany 


J. Helen Gardner and Robin J. Wilson 


Yesterday evening about 30 members of the Halifax Mechanics Institute and Mutual Improve- 
ment Society took tea together at Stott’s Temperance Hotel, Broad Street, for the purpose of 
presenting a testimonial of respect to Mr Thomas A. Hirst, assistant to Mr Carter, land surveyor, 
who is about leaving the town. Mr Hirst has been an active voluntary teacher in the above society 
for upwards of 3½ years, and has won the esteem and respect both of the directors and members,
especially those of his own class, having taught the higher branches of mathematics with great 
ability. 


After completing his apprenticeship on 31st August 1850, Thomas Hirst “bid adieu 
to surveying” forever. Remembering his earlier brief visit to Germany, and 
attracted by John Tyndall’s enthusiasm for the University of Marburg, he resolved 
to study there. Tyndall was about to return to Marburg, and so Hirst went with 
him, arriving on 10th October after a delightful three-day journey visiting the
Rhine.

Hirst quickly established for himself a daily routine which combined studying 
the German language with indulging his love of literature and pursuing his various 
scientific activities: 


18th October 1850: ...My time is divided thus:- I rise and get my breakfast eaten before 8 a.m. 
then smoke a cigar and begin the day by one of those fast earth-bound unenthusiastic essays of 
Montaigne. This I do medicinally to discipline myself for the practical labours of the day, then 
from 9 to near 1 p.m. I work in the laboratory... Dinner, and afterwards German translation 
until dusk (5½), then a walk until lamp time, then German again (with an interval for tea) until 9½;
from that time to 10 my journal occupies me generally, and from 10 to 10.30 as I said Tyndall 
and I sweeten the day’s labour with a poem. 


Hirst matriculated at the University on 2nd November 1850. Being uncertain of the 
exact direction which his studies should take, he decided to pursue the three 
sciences of chemistry, physics and mathematics. His hope was that, by attending 
lectures in these areas, he would be able to “make choice which of the three 
should form the subject of my future and more particular study”. 

Just as Tyndall had done previously, he attended the lectures of Robert Bunsen 
on chemistry, Christian Gerling on physics, and Friedrich Stegmann on mathemat- 
ics. He was most enthusiastic about a laboratory session of Bunsen, but seemed 
rather less enthusiastic about the lectures of the other two: 


5th November 1850: ... From 9 to 12 at laboratory. All the students began their practical course,
the lectures don’t commence until Thursday. Bunsen however, was present all the time and
moved about from one to another, in a way that does one’s heart good; able man as he is
everywhere acknowledged to be, there is not the least spark of pride in him, his disposition
which shines through his face is a model of gentleness, geniality and integrity and humility, he is
universally beloved here, and in his presence all feel at home and encouraged.

12 to 1 with Gerling. He is an old man with a good deal of the pedant about him, of weak
concentrative intellect, and as is usual too much vanity. With all that however, he is what is
generally called a good-hearted old boy.

From 3 to 4 with Stegmann who differs again materially from the rest. His appearance is not
prepossessing, he is an ordinary looking little man with however, a sharp nose, pale studious
face, and deep sunk eyes. He bolts into the room and into his mathematics at one and the same
time, wasting no time either in prelude or wordy introduction. There is a figure chalked on his
black-board almost before you are aware he is present, he talks in a slow distinct voice, carves
his subject deliberately piecemeal and at his exit as at his entrance you are just considering and
in the middle of his last equation when you find that he has bolted, shut the door and not a
vestige even of his coat flap is visible. The idea presents itself to you that if you were to follow
him the moment you missed him, you would find him buried in a mathematical problem in his
own room. When you come to know him, however, you find him a thorough good fellow, who
always pretends less than he intends performing.


Tyndall’s description of the University

Our University is not grand, it is broken into parts and presents no imposing front. Our
laboratory presents rather a scoundrel-like appearance, but don’t conclude hastily against it—
it holds a man [Bunsen] whose superior as a chemist is not to be found within a radius of 8000
miles from the Piece Hall of Halifax. There, however, right over against me on the summit of a
hill, with the sun shining upon its white walls, and its tower piercing the air, is a fine
building—an astronomical observatory and physical institute, its interior furnished with costly
apparatus; on the other hand I can lead you into a little room with hacked rickety benches,
perhaps the whole not worth five and sixpence, where a man of genius makes his hearers forget
the poorness of his furniture, as he crushes the crust of a mathematical calculation between his
fingers.


Daily his German became more fluent and his understanding more reliable. His 
attendance at lectures helped with this, and by mid-November Hirst began to 
notice the improvement himself: 


19th November 1850: ...I find that with Stegmann I am learning more German than with any 
other. He reads mathematical operations for us to copy in writing. At first I could not copy a 
word, then occurred a space of time when most terrible and exasperating blanks occurred. Now 
only a few blanks in a page perhaps... 


The turn of the year showed that Hirst had, as earlier in Halifax, quickly settled 
himself comfortably into a new community, and by the Spring he felt quite at 
home. It came as a great disappointment when Bunsen announced his impending 
departure from Marburg: 


5th April 1851: ... Bunsen called on me. He is a kind fellow indeed, during the last 2 months I have
been working at the quantitative analysis of some minerals from Iceland, and he has been at great
pains in explaining a theory of his as to their formation, by which theory the calculated and
analysed composition shew a remarkable agreement. My analyses are a further proof of the veracity
of his law, and he, thinking that some publicity would be of service and acceptable to me, proposed
to me to write a small notice of my analyses and calculations for “Liebig’s Annalen”; nay more in
spite of his extreme business just now as in 2 or 3 days he leaves Marburg he has sketched out an
article and to-day brought it to me. As to my share in the investigation it has been so commonplace
that I should certainly refuse to publish any such article. Viewed, however, as a corroboration of
his work, it will extend the speed of his researches and so I do it. As for the kindness to me, it
was well meant, though if he knew me better he would not have offered it.

[Portrait: Robert Bunsen (1811-1899)]


He increasingly gained satisfaction from his mathematical work, frequently to the 
detriment of his other subjects: 


15th June 1851: ...I could do nothing well but mathematics, this week. Physics or chemistry or 
general literature were as arrows, that could find no entrance through my mathematical coat of 
mail, but glanced off merely... 
Never content to relax, he worked up to sixteen hours a day, and this soon began
to affect his health. Overwork and lack of physical exercise began to cause 
intestinal problems that were to affect him throughout his life, and he suffered an 
attack of dropsy. His dissatisfaction with his life style, and his frustration with his 
lack of progress, frequently spill over into the pages of his diaries. 


27th July 1851: Many a time this week have I cursed this inward shrinking at intellectual 
obstacles, subjects pass by me skimmed, not penetrated into; and in spite of the day’s proper 
number of hours having been devoted to one’s task, at their close is no satisfaction. Sometimes I 
cry to myself, “Is it not possible to get thyself absorbed in thy work, Tom—fully?” and if not, “Is
success possible?” To which I can but answer, “Thy Duty is to do thy work, with or without 
absorption therein; therefore, go about it instantly.” Patience, therefore, more energetic work, 
and action is my need. Then from the feeling of recreation earned, the latter also will react on 
health and strength. At present my work and recreation are both accompanied with too little 
physical exercise. That, therefore, is the point to be attacked. 


His life style even affected his social activities: 


10th August 1851: On Tuesday evening at Museum, at a ball in the gardens. The night was chill, 
I dropped too suddenly from Differential Calculus into ladies’ society, and could not give myself 
freely to the change. After an hour’s unsuccessful attempt so to do, I returned, cursing the mode 
of life I was pursuing; next morning I had already shaken hands, however, with Diff. Calculus, 
and forgot the ladies... 


He found relaxation in reading Tennyson and Carlyle, and translating into English 
the works of Goethe and Schiller: 


12th October 1851: My days have been thus divided: up at 7, breakfast and Schiller until 8, then 
mathematics until 12.30, a walk from that time to 1, then dinner and Schiller or ‘Leader’ until 
2.30. Once more mathematics until 5, then Physics until 7; from 7 to 8 tea and Schiller, from 8 to 
11 translations of Schiller, from 11 to 12 cigar and Schiller, then to bed. 


It was around this time that the direction of his future studies began to emerge 
more clearly. Believing that “if the heart is not in the work, there is poor chance 
either of success therein, or of steady perseverance”, he found the idea of 
concentrating on mathematics increasingly compelling: 


14th December 1851: ... After waverings and experiments every day brings with it the stronger 
conviction that my labour, in which I must find my daily discipline and duty, must be in the 
mathematical field. Many a time have I asked myself “what then is the absolute value of being 
expert at addition and subtraction? Did I come into the world to be an animated Ready-reck- 
oner merely?” Such questions occur daily more seldom, dim visions of a higher destiny have long 
floated before me, as God forbid they should ever cease to do. But they have brought with them 
heretofore not merely a disturbance of the concentration necessary bravely to fulfil the day’s 
duty, but also scattered energy to fulfil any work and even a morbid depreciation as to the value 
of all work itself. These dim visions of a higher destiny are like too full sails—dangerous, when
the proportionate ballast is not there... I begin to get a gleam that there is a higher value in the 
multiplication table than that which teaches us that twice two make four. The Ready-reckoner 
even may have its transcendent side... 


Life by now was incomparably better than it had been, although festivals were still 
celebrated with his books, and social occasions were usually overshadowed by
work: 


28th December 1851: A different Christmas to any I ever spent before has again passed by. This 
time I had no one to share it with except Brandes Analyt: Geom: and Boucharlat’s Mechanics, 
both which, however, if not merry, were at least interesting companions. 


31st December 1851: To-night there is a ball in the Ritter. I am seated at my table at the window 
investigating the properties of an Ellipsoid. The music comes across the Ketzerback, mellowed 
by the gusts of wind—it is as if Nature had turned my room into a flute and breathed soothing 
harmony through it. All this serves as an accompaniment, almost unconsciously so, to my work. I 
have not been out for two or three days, and not the slightest idea that the New Year was on the 
threshold and the Old Year nearly dead: when suddenly my neighbour St Elizabeth announces 
the fact by tolling 12 times—simultaneously outside, where all was before in stillness, the air 
rings with cries of “Prost Neu Jahr!” ...My pen fell from my hand, and the whole past year 
stood before me with wondrous vividness. It has been an eventful one to me—filled with 
manifold new and instructive experiences. More foothold I do possess than before, so hail to 
thee, New Year. “Have at you,” as boxers would say. 


Hirst’s description of Marburg 

... Marburg stands on the inner apex of an acute angle in the Lahn valley, which is a river 
running nearly North and South to the Rhine. Marburg stands then on the west bank, and the 
river flows past it with a graceful sweep into a quiescent broad hill-encircled valley to the South. 
It was near sunset, with a beautiful sky and a wind just strong enough to make the dying leaves 
sing musically and take their last and only flight high into the air before they sunk to their final 
rest... Immediately before us was Augusta’s Ruhe and a little farther Marburg Castle on hills of 
about equal height their slopes carefully terraced into rich looking gardens. To the west the sun 
was sinking behind the far distant purple hill between which and us was the most graceful 
alternation of hills and valleys with their red fallows, green, beautiful green meadows and brown 
woods. The spires of the church rose tapering in calm religious ascension, and the grand old 
castle, looked over all, with its most resigned and reverend glance. Marburg thou art indeed set 
in the midst of a fairy land! 


By March 1852, he was completely at home in Marburg, commenting that “Germany 
and Germans are now to me as a native land and brothers, whereas the year 
before there was ever a feeling of strangeness present”. He had determined to
complete his studies there during the vacation, but a letter from John Tyndall 
changed his plans entirely. 


1st March 1852: ... An unconscious notion possessed me before that haste was needed and that 
it was time I left Germany. How the notion came I know not. True it is, however, when John said 
if he were in my place he would be in no haste the idea struck me as new, and in an hour I had 
made other plans, namely quite silently and unknown to any of them I will walk in and visit 
England, and as quickly and quietly return to Berlin or Gottingen at the beginning of next 
Semester. This arrangement will give me an opportunity to proceed far further with my 
Mathematics, and to hear some of the first German mathematicians; after which time I may sit 
down more confidently to a dissertation... 


However, Stegmann advised him to take his oral examination before leaving for 
home and then to return to complete his written dissertation. His oral examination 
took place on 16th March 1852, and covered physics (the motion of a pendulum, 
acoustics and light), crystallography and chemistry, and mathematics. It is interest- 
ing to note the areas covered by a mathematics student at that time: 


...we went through part of the theory of Equations, namely the solution of general and 
numerical equations of higher powers—also the principal methods of elimination with two or 
more unknown terms. From this he turned to the theory of Curved Surfaces, principally on the 
Tangential Plane and Euler’s Law of Curvature. He did not ask a single question in Differential 
or Integral Calculus, for which I was sorry. Instead of that, he asked finally a question in 
Descriptive Geometry, for which I was not so well prepared... After a close examination of two
hours, however, I was ordered to retire and in a few minutes was recalled, when the Decan told 
me they were satisfied, and that as soon as the necessary dissertation was approved I should 
receive my Degree... 


He could now prepare for his brief visit to England. 


17th March 1852: After packing up my traps I went round to say good-bye to Professors and 
friends. Congratulations met me on every hand. They were mostly sincere, too, and as I had 
earned them I received them willingly. Shortly before dusk I took a walk towards Wertha, as I 
did yesterday evening before my examination. Then to prepare myself for the coming trial—now
to cogitate on my past year and a half’s work... Another phase of my life is concluded, and 
thank God, it is an improvement on the foregoing. Here, however, it must not and shall not rest 
—it is but the beginning of new and better directed activities... 


After returning to Marburg, he began work on his Ph.D. dissertation, “On 
conjugate diameters of the triaxial ellipsoid”. By mid-June, he was able to write: 


13th June 1852: ...the neck of my dissertation is already broken. Last Christmas, as I told you, I 
looked round the matter and spent the last part of a week thereon. Since I returned from 
England I have stuck pretty closely to it for ten days, and it is now done. The thing is small, it is 
true, but I have Stegmann and Schell’s authority when I say it is a neat little investigation; both 
of them kindly offered to give me any assistance they could, but I did not require it... 


However, it was not all plain sailing. In particular, he had a lot of trouble trying to 
simplify one complicated, but important, expression. Even Wilhelm Schell, his 
“quick, brilliant and impulsive” supervisor, was unable to help. But a few days 
later, Hirst was successful: 


One morning at 5 a.m. in bed a thought struck me in reference to this identical expression, to 
interpret whose significance had baffled me for two days. Acting on the hint, I got up, washed 
myself from top to toe, and walked into it until dinner time. It was one of the luckiest hints that
ever came to me—obstacle after obstacle tumbled before it, and in two days after the whole 
problem, much to my surprise as well as that of Stegmann and Schell, was solved. The same day 
Stegmann came to sit an hour with me, not having seen me for more than a week. ‘How do you 
get on with the Dissertation?’ he asked. ‘I think I have done it, Professor,’ I replied. He put on 
one of his half sarcastic, half sceptical smiles, and asked me to shew him it. I did so. He made no 
remark at all, until I had finished—then from his countenance one could scarcely interpret an 
approval—the careful dog—He then called me back very pertinently to a few important parts I
had explained badly: and at length expressed his entire satisfaction saying he could suggest no 
improvement. I have just translated it into German roughly, and Schell is kindly correcting it for 
me... 


[Figure: title page of Thomas Hirst’s Ph.D. dissertation, “Ueber conjugirte Diameter im dreiaxigen
Ellipsoid” (“On conjugate diameters of the triaxial ellipsoid”), Marburg, 1852.]


The dissertation was quickly approved, and Hirst was awarded his doctorate in July 
1852. Like his friend Tyndall, he had completed his studies within two years, 
instead of the usual three. 


3rd July 1852: I received orders to-day to attend upon the University Decan, Prof. Bergk, which 
I obeyed. It was to tell me that my dissertation had been approved of by the Philosophical 
Faculty, and upon delivering 120 printed copies to the University I should receive my Diploma. I 
took the MS. therefore, immediately to Printer Kock. 

I learnt afterwards from Professor Stegmann that it first went to Prof. Gerling and his written 
Opinion on the accompanying form was to the effect: “I find the dissertation good, and have only 
a few suggestions with respect to order and other trivial matters to make; I think it, however, 
advisable for Prof. Stegmann to certify publicly that Mr Hirst has made it without his help”!!! The 
poor old fellow, I suppose, felt slighted that after hearing his lectures on Trigonometry I 
declined making him my mathematical tutor. Stegmann certified accordingly that the dissertation 
was completed before he saw it—indeed, he might have added that he was not in Marburg when 
it was written. 

I have received an invitation to become a member of the Mathematical Krantzchen [circle] 
with Professors Stegmann, Gerling, Hessel, etc. Doctors Kohlrausch, Schell, etc. to be held 
weekly in the open air. 


11th July 1852: On Monday evening I attended the Mathematical Krantzchen in Prof. Hessel’s 
garden. It is an interesting meeting indeed. Stegmann, with his keen intellect and quiet sarcasm, 
Gerling with intense vanity and essential insignificance, Hessel with his reserve and stubborn 
gruffness, and Schell with his unpretending, brilliant suggestions make by their contrast an 
interesting study... 
For some time Hirst had resolved to visit Gottingen and Berlin to learn from the
greatest German mathematicians of the day. In the former, he would work on 
magnetic experiments with Weber, and visit Gauss; in the latter, where he was to 
spend the winter semester, he would become a good friend of both Dirichlet and 
Steiner. His account of this exciting time in his life forms the topic of the next 
article. 
“Is he hardier than a small forest?”
How To Make Wavelets


Robert S. Strichartz 


§1. INTRODUCTION. The French call them ondelettes, these new high-tech 
gadgets in the arsenal of harmonic analysis. Move over, Fourier! Your series and 
transforms are not the only game in town. Wavelet expansions enjoy a number of 
good properties not available in other types of expansions. To see this in the 
simplest context, consider a real-valued function f(x) on the interval [0,1]. You 
can expand it in a Fourier series 


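$$f(x) = b_0 + \sum_{k=1}^{\infty} (b_k \cos 2\pi kx + a_k \sin 2\pi kx) \tag{1.1}$$

or you can expand it in a Haar function series

$$f(x) = c_0 + \sum_{j=0}^{\infty} \sum_{k=0}^{2^j - 1} c_{jk}\,\psi(2^j x - k) \tag{1.2}$$

where $\psi(x)$ is the function defined by

$$\psi(x) = \begin{cases} 1 & \text{if } 0 < x < \tfrac{1}{2} \\ -1 & \text{if } \tfrac{1}{2} < x < 1 \\ 0 & \text{otherwise} \end{cases} \tag{1.3}$$

(see FIGURE 1).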


Figure 1. The graph of the generator of the Haar functions. 
Both series are examples of expansions in terms of orthogonal functions in
$L^2(0, 1)$. Thus there are simple formulas for the coefficients. (Exercise: Show that
$\{\psi(2^j x - k)\}$ are orthogonal, but not normalized.) But the Fourier series is not
well localized in space; if you are interested in the behavior of f(x) on a
subinterval [a, b] you need to involve all the Fourier coefficients. On the other
hand, the Haar series is very well localized in that to restrict attention to the
subinterval [a, b] you need only take the sum in (1.2) over those indices for which
the interval $I_{jk} = [2^{-j}k, 2^{-j}(k + 1)]$ (the support of $\psi(2^j x - k)$) intersects [a, b].
Furthermore, the partial sums of the Haar series (summing over $0 \le j \le N$) clearly
represent an approximation to f taking into account details on the order of
magnitude $2^{-N}$ or greater. These two properties, localization in space, and scaling,
are the hallmarks of wavelet expansions. In addition, the Haar functions are
created out of a single function $\psi$ by dyadic dilations and integer translations.
Essentially the same property is shared by all the wavelet bases we will discuss, and
may in fact be taken as an approximate definition of a wavelet expansion.
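The coefficient formulas and the scaling property of the partial sums can be tried out numerically. The sketch below is an illustration under stated assumptions, not anything from the article: f is represented by its values at the midpoints of $2^M$ equal cells of [0, 1], integrals are replaced by midpoint sums, and the coefficient is taken to be $c_{jk} = 2^j \int f(x)\,\psi(2^j x - k)\,dx$ (the factor $2^j$ compensates for the functions not being normalized). The function names are hypothetical.

```python
def psi(x):
    """Haar generator of (1.3)."""
    if 0 < x < 0.5:
        return 1.0
    if 0.5 < x < 1:
        return -1.0
    return 0.0


def haar_coefficients(samples, levels):
    """Return c_0 and {(j, k): c_jk} for j < levels.

    samples holds f at the midpoints of len(samples) equal cells of [0, 1];
    len(samples) is assumed to be a power of 2 with levels <= log2(len(samples)).
    """
    n = len(samples)
    c0 = sum(samples) / n  # integral of f over [0, 1]
    coeffs = {}
    for j in range(levels):
        for k in range(2 ** j):
            total = sum(
                v * psi(2 ** j * ((i + 0.5) / n) - k)
                for i, v in enumerate(samples)
            )
            coeffs[(j, k)] = 2 ** j * total / n
    return c0, coeffs


def partial_sum(c0, coeffs, x):
    """Evaluate c_0 plus the sum of c_jk * psi(2^j x - k) at the point x."""
    return c0 + sum(c * psi(2 ** j * x - k) for (j, k), c in coeffs.items())


if __name__ == "__main__":
    n = 16
    samples = [((i + 0.5) / n) ** 2 for i in range(n)]
    c0, coeffs = haar_coefficients(samples, 2)  # levels j = 0, 1
    print([round(partial_sum(c0, coeffs, (i + 0.5) / n), 4) for i in range(n)])
```

With levels j = 0, 1 the partial sum is constant on each interval of length 1/4 and equals the average of the samples there, which is one concrete reading of "details on the order of magnitude $2^{-N}$ or greater".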

The wavelet expansions we are going to construct can be thought of as
generalizations of the Haar series, in which the function $\psi$ is replaced by smoother
cousins. Before we can say exactly what properties we want these functions to 
have, and how we can go about constructing them, it is useful to backtrack and see 
exactly how the Haar functions arise. It will turn out to be easier if we consider the 
whole line as the domain of our functions. 


§2. THE ROUGH-AND-READY HAAR WAVELETS. We begin with the function
$\varphi$ = characteristic function of the unit interval [0, 1]. Surely this is one of the
simplest functions one can imagine, but it is chosen because it has two important
properties:

(i) the translates of $\varphi$ by integers, $\varphi(x - k)$, $k \in \mathbb{Z}$, form an orthonormal set
of functions for $L^2(\mathbb{R})$;

(ii) $\varphi$ is self-similar. If you cut the graph in half then each half can be expanded
to recover the whole graph. This property can be expressed algebraically by the
scaling identity

$$\varphi(x) = \varphi(2x) + \varphi(2x - 1). \tag{2.1}$$

We will call $\varphi$ the scaling function. (In the French literature it is sometimes
called “le père” and $\psi$ is called “la mère,” but this shows a scandalous misunderstanding
of human reproduction; in fact the generation of wavelets more closely
resembles the reproductive life style of an amoeba.) In fact, the scaling identity
essentially determines $\varphi$ up to a constant multiple (exercise). The significance of
the scaling identity is the following: Let $V_0$ denote the linear span of the functions
$\varphi(x - k)$, $k \in \mathbb{Z}$ (or by abuse of notation the closure in $L^2(\mathbb{R})$ of this span,
$\sum_k a_k \varphi(x - k)$ with $\sum_k |a_k|^2 < \infty$). This is a natural space to consider in view of
(i), since the functions $\varphi(x - k)$ form an orthonormal basis for $V_0$. Of course $V_0$ is
not all of $L^2$, it is the subspace of piecewise constant functions with jump
discontinuities at $\mathbb{Z}$. We can get a larger space by rescaling. Let $(1/2)\mathbb{Z}$ denote the
lattice of half-integers $k/2$, $k \in \mathbb{Z}$, and let $V_1$ denote the subspace of $L^2$ of
piecewise constant functions with jumps at $(1/2)\mathbb{Z}$. It is clear that $f(x) \in V_0$ if and
only if $f(2x) \in V_1$, and the functions $2^{1/2}\varphi(2x - k)$ form an orthonormal basis for
$V_1$ (the factor $2^{1/2}$ is thrown in to make the normalization $\|2^{1/2}\varphi(2x - k)\|_2 = 1$
hold). The scaling identity (2.1), or rather its translated version

$$\varphi(x - k) = \varphi(2x - 2k) + \varphi(2x - 2k - 1) \tag{2.1'}$$
says exactly $V_0 \subset V_1$, since a basis for $V_0$ is explicitly represented as linear
combinations of basis elements of $V_1$. (Of course the containment $V_0 \subset V_1$ is clear
from the description of the spaces $V_0$ and $V_1$ in terms of locations of jump
discontinuities, but in the generalizations to come there will be no such simple
description; however, there will be a scaling identity.)
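Both properties of the characteristic function are easy to test by machine. The following sketch is only an illustration: it takes the indicator of the half-open interval [0, 1) rather than [0, 1] (an assumption made here so that the scaling identity holds at every point instead of almost everywhere), and it computes inner products by a midpoint sum, which is exact for step functions with jumps at the integers. The function names are hypothetical.

```python
def phi(x):
    """Indicator of [0, 1); the half-open convention is an assumption of this sketch."""
    return 1.0 if 0 <= x < 1 else 0.0


def scaling_residual(x, k=0):
    """phi(x - k) minus the right-hand side of the translated scaling identity."""
    return phi(x - k) - (phi(2 * x - 2 * k) + phi(2 * x - 2 * k - 1))


def inner_of_translates(k, l, lo=-4, hi=4, cells_per_unit=2):
    """Midpoint-rule value of the integral of phi(x - k) * phi(x - l) over [lo, hi]."""
    h = 1.0 / cells_per_unit
    total = 0.0
    for i in range((hi - lo) * cells_per_unit):
        x = lo + (i + 0.5) * h
        total += phi(x - k) * phi(x - l)
    return total * h


if __name__ == "__main__":
    grid = [i / 8 for i in range(-32, 33)]
    print(all(scaling_residual(x, k) == 0 for x in grid for k in (-1, 0, 2)))
    print([[inner_of_translates(k, l) for l in range(-1, 2)] for k in range(-1, 2)])
```

The residual vanishing for every translate k is the pointwise form of the statement that each basis element of $V_0$ is a combination of two basis elements of $V_1$.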

The whole story can now be iterated, both up and down the dyadic scale.
The result is an increasing sequence of subspaces $V_j$ for $j \in \mathbb{Z}$, where $V_j$ consists
of the piecewise constant $L^2$ functions with jumps at $2^{-j}\mathbb{Z}$, and the functions
$2^{j/2}\varphi(2^j x - k)$ for $k \in \mathbb{Z}$ form an orthonormal basis for $V_j$. We can pass back and
forth among the spaces $V_j$ by rescaling: $f(x) \in V_j$ if and only if $f(2^{k-j}x) \in V_k$, and
the scaling identity (2.1), suitably rescaled, says $V_j \subset V_k$ if $j < k$. The sequence $\{V_j\}$
is an example of what is called a multiresolution analysis. There are two other
properties of $\{V_j\}$ that are significant, namely

$$\bigcap_{j \in \mathbb{Z}} V_j = \{0\}, \tag{2.2}$$

and

$$\bigcup_{j \in \mathbb{Z}} V_j \text{ is dense in } L^2 \tag{2.3}$$

(exercise).


In view of (2.3) it would seem tempting to try to combine all the orthonormal
bases $\{2^{j/2}\varphi(2^j x - k)\}$ of $V_j$ into one orthonormal basis for $L^2(\mathbb{R})$. But look,
although $V_j \subset V_{j+1}$, the orthonormal basis $\{2^{j/2}\varphi(2^j x - k)\}$ for $V_j$ is not contained
in the orthonormal basis $\{2^{(j+1)/2}\varphi(2^{j+1}x - k)\}$ for $V_{j+1}$. (Indeed, there are
distinct elements in the two orthonormal bases that are not orthogonal to each
other.) So our first naive attempt to obtain an orthonormal basis for $L^2(\mathbb{R})$ is
flawed. Can we fix it up?

Back to the drawing boards! Since $V_0 \subset V_1$ and we have an orthonormal basis
for $V_0$ of the form $\{\varphi(x - k)\}$, why don’t we try to complete an orthonormal basis
of $V_1$ by adjoining functions of the form $\{\psi(x - k)\}$ for some function $\psi$? This is
the same thing as asking for an orthonormal basis of the desired form for the
orthogonal complement of $V_0$ in $V_1$, which we denote $W_0$, so $V_1 = V_0 \oplus W_0$
(Hilbert space direct sum).
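One way to get a feel for what is being asked is to test a candidate numerically. The sketch below tries the Haar generator of (1.3): it checks that its integer translates are orthonormal and are orthogonal to every translate of the characteristic function, which is what membership in the orthogonal complement requires. This is an illustration only; the half-open intervals, the helper names, and the midpoint-sum inner product (exact for step functions with jumps at half-integers) are assumptions of the sketch.

```python
def phi(x):
    """Indicator of [0, 1) (half-open convention assumed for this sketch)."""
    return 1.0 if 0 <= x < 1 else 0.0


def psi(x):
    """Haar generator: +1 on the left half of the unit interval, -1 on the right half."""
    if 0 <= x < 0.5:
        return 1.0
    if 0.5 <= x < 1:
        return -1.0
    return 0.0


def translate(f, k):
    """Return the function x -> f(x - k)."""
    return lambda x: f(x - k)


def inner(f, g, lo=-4, hi=4, cells_per_unit=4):
    """Midpoint-rule inner product of f and g over [lo, hi]."""
    h = 1.0 / cells_per_unit
    total = 0.0
    for i in range((hi - lo) * cells_per_unit):
        x = lo + (i + 0.5) * h
        total += f(x) * g(x)
    return total * h


if __name__ == "__main__":
    ks = range(-2, 3)
    print([[inner(translate(psi, k), translate(psi, l)) for l in ks] for k in ks])
    print([[inner(translate(psi, k), translate(phi, l)) for l in ks] for k in ks])
```

The first table printed is the identity matrix and the second is all zeros. Showing that these translates actually span the whole complement is a separate matter that a finite check like this does not address.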

The answer is easy: we want to take $\psi$ exactly to be the Haar function generator
defined in §1. Note that $\psi$ can be expressed in terms of $\varphi$ by

$$\psi(x) = \varphi(2x) - \varphi(2x - 1) \tag{2.4}$$

which is very reminiscent of the scaling identity. Exercise: show that $\{\psi(x - k)\}$
forms an orthonormal basis for $W_0$. But now we can rescale the space $W_0$, so

$$V_{j+1} = V_j \oplus W_j \tag{2.5}$$

and $\{2^{j/2}\psi(2^j x - k)\}_{k \in \mathbb{Z}}$ is an orthonormal basis for $W_j$. If we combine conditions
(2.2), (2.3) and (2.5) we obtain

$$L^2(\mathbb{R}) = \bigoplus_{j=-\infty}^{\infty} W_j \tag{2.6}$$

and since the spaces $W_j$ are all mutually orthogonal we can now refine our naive
attempt and combine all the orthonormal bases for $W_j$ into one grand orthonormal
basis $\{2^{j/2}\psi(2^j x - k)\}_{j \in \mathbb{Z},\, k \in \mathbb{Z}}$ for $L^2(\mathbb{R})$. (The only change is that we have
replaced the scaling function $\varphi$ by the wavelet $\psi$.) This gives the Haar series basis
for the whole line. There is a minor variation on this theme that is perhaps more
closely related to the Haar expansion on the unit interval: instead of (2.6) we can
also write

$$L^2(\mathbb{R}) = V_0 \oplus \Bigl( \bigoplus_{j=0}^{\infty} W_j \Bigr) \tag{2.6'}$$

and then combine the basis $\{\varphi(x - k)\}_{k \in \mathbb{Z}}$ for $V_0$ with the bases
$\{2^{j/2}\psi(2^j x - k)\}_{k \in \mathbb{Z}}$ for $W_j$ with $j \ge 0$, to obtain an orthonormal basis for $L^2(\mathbb{R})$.
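The splitting $V_1 = V_0 \oplus W_0$ can be tried out numerically. The following is only a minimal sketch, not part of the original exposition: it assumes a function in $V_1$ is stored through its coefficients $c_k$ with respect to the orthonormal basis $\sqrt{2}\,\varphi(2x - k)$, and the names `haar_split` and `haar_merge` are illustrative. By the scaling identity and (2.4), the coefficient on $\varphi(x - m)$ is $(c_{2m} + c_{2m+1})/\sqrt{2}$ and the coefficient on $\psi(x - m)$ is $(c_{2m} - c_{2m+1})/\sqrt{2}$.

```python
import math

def haar_split(c):
    """Split V_1 coefficients into V_0 and W_0 coefficients (one Haar step).

    c[k] is the coefficient on sqrt(2)*phi(2x - k); len(c) is assumed even.
    """
    s = 1 / math.sqrt(2)
    avg = [s * (c[2 * m] + c[2 * m + 1]) for m in range(len(c) // 2)]  # on phi(x - m)
    det = [s * (c[2 * m] - c[2 * m + 1]) for m in range(len(c) // 2)]  # on psi(x - m)
    return avg, det

def haar_merge(avg, det):
    """Inverse of haar_split: rebuild the V_1 coefficients."""
    s = 1 / math.sqrt(2)
    c = []
    for a, d in zip(avg, det):
        c += [s * (a + d), s * (a - d)]
    return c

c = [4.0, 2.0, -1.0, 3.0, 0.5, 0.5, 7.0, -7.0]
avg, det = haar_split(c)
energy_in = sum(v * v for v in c)
energy_out = sum(v * v for v in avg) + sum(v * v for v in det)
rebuilt = haar_merge(avg, det)
```

Because both bases are orthonormal, `energy_in` and `energy_out` agree up to rounding, and `rebuilt` reproduces `c`. Repeating `haar_split` on `avg` descends through $V_{-1}, V_{-2}, \ldots$ in the same way.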


§3. MULTIRESOLUTION ANALYSIS. The moral of the story so far is that we
first want to build a scaling function $\varphi$ and associated multiresolution analysis
$\cdots \subseteq V_{-1} \subseteq V_0 \subseteq V_1 \subseteq \cdots$ before constructing the wavelets.


Definition. A multiresolution analysis $\cdots \subseteq V_{-1} \subseteq V_0 \subseteq V_1 \subseteq \cdots$ with scaling
function $\varphi$ is an increasing sequence of subspaces of $L^2(\mathbb{R})$ satisfying the following
four conditions:

(i) (density) $\bigcup_j V_j$ is dense in $L^2(\mathbb{R})$,

(ii) (separation) $\bigcap_j V_j = \{0\}$,

(iii) (scaling) $f(x) \in V_0 \iff f(2^j x) \in V_j$,

(iv) (orthonormality) $\{\varphi(x - \gamma)\}_{\gamma \in \mathbb{Z}}$ is an orthonormal basis for $V_0$.

It follows easily from the definition that $\{2^{j/2}\varphi(2^j x - \gamma)\}_{\gamma \in \mathbb{Z}}$ forms an orthonormal
basis for $V_j$. Since $\varphi \in V_0 \subseteq V_1$ we must have

$$\varphi(x) = \sum_{\gamma \in \mathbb{Z}} a(\gamma)\varphi(2x - \gamma) \tag{3.1}$$

for some coefficients $a(\gamma)$ satisfying

$$\sum_{\gamma \in \mathbb{Z}} |a(\gamma)|^2 = 2 \tag{3.2}$$

and in fact

$$a(\gamma) = 2\int \varphi(x)\overline{\varphi(2x - \gamma)}\,dx. \tag{3.3}$$

Equation (3.1) is the analogue of (2.1), and we will refer to it as the scaling identity.

It follows from the definition that the scaling function determines the
multiresolution analysis, but not conversely. A more difficult question is how to characterize
those functions $\varphi$ which are scaling functions for a multiresolution analysis. Here
we expect the scaling identity to play a crucial role, but before we can say more we
need to examine certain algebraic conditions on the coefficients $a(\gamma)$ that follow
from the definition.

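First, there is a consistency condition that arises from (iv) and (3.1). We know
from (iv) that

$$\int \varphi(x - \gamma)\overline{\varphi(x)}\,dx = \delta(\gamma, 0) \tag{3.4}$$

(Kronecker $\delta$). If we use (3.1) to substitute for $\varphi(x - \gamma)$ and $\varphi(x)$ in (3.4) we
obtain

$$\sum_{\gamma' \in \mathbb{Z}} \sum_{\gamma'' \in \mathbb{Z}} a(\gamma')\overline{a(\gamma'')} \int \varphi(2x - 2\gamma - \gamma')\overline{\varphi(2x - \gamma'')}\,dx
 = 2^{-1} \sum_{\gamma'' = 2\gamma + \gamma'} a(\gamma')\overline{a(\gamma'')} = \delta(\gamma, 0)$$

after the change of variable $x \to 2^{-1}x$ and use of (3.4). We rewrite this as

$$\sum_{\gamma' \in \mathbb{Z}} a(\gamma')\overline{a(2\gamma + \gamma')} = 2\delta(\gamma, 0). \tag{3.5}$$

Note that (3.5) contains (3.2) as a special case.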

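Another algebraic condition arises if we assume $\varphi$ is integrable and $\int \varphi(x)\,dx \ne 0$
(if $\int \varphi(x)\,dx = 0$ then the same is true for all functions in all $V_j$, so we would not
expect to have the density condition (i)). Then we integrate (3.1) and make a
change of variable to obtain

$$\int \varphi(x)\,dx = \sum_{\gamma \in \mathbb{Z}} a(\gamma) \int \varphi(2x - \gamma)\,dx
 = \sum_{\gamma \in \mathbb{Z}} a(\gamma)\, 2^{-1} \int \varphi(x)\,dx,$$

hence

$$\sum_{\gamma \in \mathbb{Z}} a(\gamma) = 2. \tag{3.6}$$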
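The two algebraic conditions are easy to test for a finite list of coefficients. The sketch below is illustrative only: it assumes the coefficients are stored in a dictionary mapping $\gamma$ to $a(\gamma)$, and the function names are hypothetical. It checks (3.5) for every shift that could give a nonzero sum, together with (3.6).

```python
def shifted_sum(a, g):
    """Left side of (3.5): sum over g' of a(g') * conj(a(2g + g'))."""
    return sum(v * complex(a.get(2 * g + k, 0)).conjugate() for k, v in a.items())

def satisfies_3_5_and_3_6(a, tol=1e-12):
    """True if the finite coefficient table a satisfies both (3.5) and (3.6)."""
    span = max(a) - min(a)
    ok_35 = all(abs(shifted_sum(a, g) - (2 if g == 0 else 0)) < tol
                for g in range(-span, span + 1))
    ok_36 = abs(sum(a.values()) - 2) < tol
    return ok_35 and ok_36

haar = {0: 1.0, 1: 1.0}               # coefficients of the Haar scaling identity
gap = {0: 1.0, 3: 1.0}                # a(0) = a(3) = 1, all other a(g) = 0
lopsided = {0: 1.0, 1: 0.5, 2: 0.5}   # sums to 2 but violates (3.5)

results = {name: satisfies_3_5_and_3_6(table)
           for name, table in [("haar", haar), ("gap", gap), ("lopsided", lopsided)]}
```

Note that the table `gap` passes both tests, which shows that (3.5) and (3.6) are necessary conditions on the coefficients but are not by themselves sufficient for orthonormality of the translates.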
Now we would like to reverse the procedure. Step 1 will be to produce solutions
$a(\gamma)$ to the algebraic identities (3.5) and (3.6). Step 2 will be to define the scaling
function via the scaling identity (3.1). Notice that (3.1) says that $\varphi$ is a fixed point
of the linear transformation

$$Sf(x) = \sum_{\gamma \in \mathbb{Z}} a(\gamma) f(2x - \gamma) \tag{3.7}$$

so it is reasonable to try to construct $\varphi$ by iterating $S$,

$$\varphi = \lim_{n \to \infty} S^n f \tag{3.8}$$

for some reasonable initial function $f$. In a later section we will discuss another
method for solving (3.1). Step 3 will be to prove that the function $\varphi$ that solves
(3.1) (normalized so $\|\varphi\|_2 = 1$) generates a multiresolution analysis. This is the
trickiest step, because there are simple counterexamples to show that it is not
always true (try $a(\gamma)$ equal to 1 for $\gamma = 0, 3$, and otherwise $a(\gamma) = 0$, for which
$\varphi$ is a multiple of $\chi_{[0,3)}$, which violates (iv)). Nevertheless, many choices of $a(\gamma)$ do yield a
multiresolution analysis. The difficult condition to verify is the orthonormality (iv),
and we will have to postpone the discussion of when and why this holds to a later
section. In Box 1 we will show how to establish the density (i) and separation (ii),
given orthonormality and the additional normalization condition

$$\int \varphi(x)\,dx = 1. \tag{3.9}$$
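Step 2 and the counterexample of Step 3 can be explored with a few lines of code. This is a rough sketch under stated assumptions, not an efficient algorithm: functions are plain Python callables, the operator $S$ of (3.7) is applied by composition (so the cost of one evaluation of $S^n f$ grows like the number of coefficients to the power $n$), and all names are illustrative.

```python
def S(a, f):
    """The operator (3.7): (Sf)(x) = sum over g of a(g) * f(2x - g)."""
    return lambda x: sum(c * f(2 * x - g) for g, c in a.items())

def iterate(a, f, n):
    """Return S^n f as a callable."""
    for _ in range(n):
        f = S(a, f)
    return f

def box(x):
    """Characteristic function of [0, 1)."""
    return 1.0 if 0 <= x < 1 else 0.0

def wide_box(x):
    """(1/3) times the characteristic function of [0, 3); its integral is 1."""
    return 1.0 / 3.0 if 0 <= x < 3 else 0.0

def overlap(f, shift, lo=-1.0, hi=5.0, step=1.0 / 64):
    """Midpoint-rule value of the integral of f(x) * f(x - shift) over [lo, hi]."""
    n = int(round((hi - lo) / step))
    return sum(f(lo + (i + 0.5) * step) * f(lo + (i + 0.5) * step - shift)
               for i in range(n)) * step

haar = {0: 1.0, 1: 1.0}
gap = {0: 1.0, 3: 1.0}                  # the counterexample a(0) = a(3) = 1

haar_limit = iterate(haar, box, 5)      # the box is already a fixed point of S
fixed = S(gap, wide_box)                # wide_box is a fixed point for the counterexample
leak = overlap(wide_box, 1)             # inner product of wide_box with its translate
```

For the Haar coefficients the box function is reproduced exactly. For the counterexample, `wide_box` is a fixed point of $S$, but `leak` is nonzero, so its integer translates are not orthogonal and (iv) fails.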


Now we are ready to move on to Step 4, which is the construction of the
wavelets themselves.

Proofs of Density and Separation


Lemma B1.1. Let $V_0$ be any subspace of $L^2(\mathbb{R})$ which is contained in $L^\infty(\mathbb{R})$
and which has the property that

$$\|f\|_\infty \le c\,\|f\|_2 \quad \text{for all } f \in V_0. \tag{B1.1}$$

Define $V_j$ by the scaling condition (iii) (no assumption of the sort $V_j \subseteq V_{j+1}$ is
necessary). Then (ii) holds.


Proof: The scaling condition and a simple change of variable transform
(B1.1) into

$$\|f\|_\infty \le c\,2^{j/2}\|f\|_2 \quad \text{for all } f \in V_j. \tag{B1.2}$$

If $f \in \bigcap_j V_j$ then (B1.2) holds for all $j$, and letting $j \to -\infty$ we obtain
$\|f\|_\infty = 0$, hence $f = 0$. Q.E.D.


The estimate (B1.1) is easy to obtain in our case. For simplicity assume $\varphi$
is bounded and has compact support, which will be the case in all our
examples. Then by the orthonormality (iv) we have

$$f(x) = \sum_{\gamma \in \mathbb{Z}} \varphi(x - \gamma) \int f(y)\overline{\varphi(y - \gamma)}\,dy = \int K(x, y) f(y)\,dy$$

where $K(x, y) = \sum_{\gamma \in \mathbb{Z}} \varphi(x - \gamma)\overline{\varphi(y - \gamma)}$, so

$$\|f\|_\infty \le \sup_x \Bigl( \int |K(x, y)|^2\,dy \Bigr)^{1/2} \|f\|_2 = \sup_x \Bigl( \sum_{\gamma \in \mathbb{Z}} |\varphi(x - \gamma)|^2 \Bigr)^{1/2} \|f\|_2$$

and $\sum_{\gamma \in \mathbb{Z}} |\varphi(x - \gamma)|^2$ is uniformly bounded (of course much weaker conditions
on $\varphi$, such as rapid decrease, will also imply this).


Lemma B1.2. Assume $\varphi$ has compact support and satisfies (3.1) and (3.9), and
the orthonormality condition (iv). Then the density condition (i) holds.


Sketch of Proof: Let $P_j f(x) = 2^j \sum_{\gamma \in \mathbb{Z}} \varphi(2^j x - \gamma) \int f(y)\overline{\varphi(2^j y - \gamma)}\,dy$ denote
the orthogonal projection onto $V_j$. We need to show $\lim_{j \to \infty} P_j f = f$ in $L^2$ for all
$f \in L^2$, which is equivalent to $\lim_{j \to \infty} \|P_j f\|_2 = \|f\|_2$ by the Pythagorean theorem.
It suffices to prove this for $f = \chi_A$, $A$ any interval, by a density argument.
But $\|P_j \chi_A\|_2^2 = 2^j \sum_{\gamma \in \mathbb{Z}} |\int_A \varphi(2^j y - \gamma)\,dy|^2 = 2^{-j} \sum_{\gamma \in \mathbb{Z}} |\int_{2^j A} \varphi(y - \gamma)\,dy|^2$. For
large $j$, $2^j A$ will be a large interval, so essentially either $\int_{2^j A} \varphi(y - \gamma)\,dy = 0$ if
$\gamma \notin 2^j A$ or $\int_{2^j A} \varphi(y - \gamma)\,dy = 1$ if $\gamma \in 2^j A$ by (3.9) (for $\gamma$ in a small neighborhood
of the boundary of $2^j A$ this is not quite correct, but in the limit we can
ignore this detail). Thus $\|P_j \chi_A\|_2^2 \approx 2^{-j}\,\#\{\gamma \in 2^j A\} \approx \mathrm{length}(A) = \|\chi_A\|_2^2$ and
in the limit this becomes equality. Q.E.D.
Notice that we could essentially reverse the argument to deduce the
necessity of the normalization condition (3.9).

§4. THE WAVELETS. We will consider the scaling function $\varphi$ to be the first
element $\varphi = \psi_0$ of a pair of functions $\psi_0, \psi_1$, with $\psi_1$ being the wavelet generator.
We would like the functions $\{\psi_k(x - \gamma)\}_{\gamma \in \mathbb{Z},\, k=0,1}$ to be an orthonormal basis for
$V_1$. Since the functions $\{\varphi(2x - \gamma)\}_{\gamma \in \mathbb{Z}}$ already form an orthogonal basis for $V_1$,
the functions $\psi_0(x)$ and $\psi_1(x)$ must be linear combinations of $\varphi(2x - \gamma)$, so they
must satisfy an identity

$$\psi_k(x) = \sum_{\gamma \in \mathbb{Z}} a_k(\gamma)\varphi(2x - \gamma), \quad k = 0, 1 \tag{4.1}$$

which generalizes (3.1) (of course $a_0(\gamma) = a(\gamma)$). Notice that for $k = 1$ (4.1) is an
explicit formula; there is nothing to solve. But what kind of conditions should we
put on the coefficients $a_k(\gamma)$? The same reasoning that led to (3.5) leads to

$$\sum_{\gamma' \in \mathbb{Z}} a_j(\gamma')\overline{a_k(2\gamma + \gamma')} = 2\delta(j, k)\delta(\gamma, 0). \tag{4.2}$$

On the other hand, the condition $\int \varphi(x)\,dx \ne 0$ is not something we can expect to
hold for $\psi_1$ (think of the example of Haar functions), so condition (3.6) can only
be recopied in our new notation

$$\sum_{\gamma \in \mathbb{Z}} a_0(\gamma) = 2. \tag{4.3}$$

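Lemma 4.1. If $\{\varphi(x - \gamma)\}_{\gamma \in \mathbb{Z}}$ is an orthonormal set and if $a_k(\gamma)$ satisfy (4.2) and
(4.3) then $\{\psi_k(x - \gamma)\}_{\gamma \in \mathbb{Z},\, k=0,1}$ is an orthonormal set.


Proof: It suffices to show

$$\int \psi_j(x)\overline{\psi_k(x - \gamma)}\,dx = \delta(j, k)\delta(\gamma, 0). \tag{4.4}$$

Now

$$\int \psi_j(x)\overline{\psi_k(x - \gamma)}\,dx = \sum_{\gamma' \in \mathbb{Z}} \sum_{\gamma'' \in \mathbb{Z}} a_j(\gamma')\overline{a_k(\gamma'')} \int \varphi(2x - \gamma')\overline{\varphi(2x - 2\gamma - \gamma'')}\,dx.$$

But the integral is $(1/2)\delta(\gamma', 2\gamma + \gamma'')$ by the orthonormality of $\varphi(x - \gamma)$, so (4.4)
reduces to (4.2). Q.E.D.


Remark. We have omitted the justification of the interchange of series and
integrals, but in most of the examples we will look at the series are actually finite
sums.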

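Thus $\{\psi_k(x - \gamma)\}_{\gamma \in \mathbb{Z},\, k=0,1}$ is an orthonormal set of functions in $V_1$. Is it a
basis? (A kind of pseudo dimension counting argument makes this very plausible.)
To show that it is a basis it suffices to represent each function $\varphi(2x - \tilde\gamma)$ as a
linear combination, and we know the coefficients will have to be

$$\int \varphi(2x - \tilde\gamma)\overline{\psi_k(x - \gamma)}\,dx = \sum_{\gamma' \in \mathbb{Z}} \overline{a_k(\gamma')} \int \varphi(2x - \tilde\gamma)\overline{\varphi(2x - 2\gamma - \gamma')}\,dx
 = \tfrac{1}{2}\,\overline{a_k(\tilde\gamma - 2\gamma)}.$$

Thus we need to show that

$$\frac{1}{2} \sum_{k=0,1} \sum_{\gamma \in \mathbb{Z}} \overline{a_k(\tilde\gamma - 2\gamma)}\,\psi_k(x - \gamma) \tag{4.5}$$

is equal to $\varphi(2x - \tilde\gamma)$. But if we substitute (4.1) into (4.5) we obtain

$$\sum_{\gamma \in \mathbb{Z}} \Bigl( \frac{1}{2} \sum_{k=0,1} \sum_{\gamma' \in \mathbb{Z}} \overline{a_k(2\gamma' + \tilde\gamma)}\,a_k(2\gamma' + \gamma) \Bigr) \varphi(2x - \gamma)$$

so it suffices to show

$$\sum_{k=0,1} \sum_{\gamma' \in \mathbb{Z}} \overline{a_k(2\gamma' + \tilde\gamma)}\,a_k(2\gamma' + \gamma) = 2\delta(\gamma, \tilde\gamma) \tag{4.6}$$

for $\tilde\gamma = 0$ or 1.


Lemma 4.2. (4.6) always holds, hence $\{\psi_k(x - \gamma)\}_{\gamma \in \mathbb{Z},\, k=0,1}$ is an orthonormal basis
for $V_1$.


Although this is a purely algebraic statement, we postpone the proof until the
next section.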
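Since the proof is deferred, it may help to see (4.2) and (4.6) hold in the one case already at hand. The sketch below is an illustration only: it uses the Haar pair $a_0 = (1, 1)$ from the scaling identity and $a_1 = (1, -1)$ from (2.4), stored as dictionaries, and the helper names are hypothetical.

```python
def lhs_4_2(aj, ak, g):
    """Left side of (4.2): sum over g' of a_j(g') * conj(a_k(2g + g'))."""
    return sum(v * complex(ak.get(2 * g + gp, 0)).conjugate() for gp, v in aj.items())

def lhs_4_6(filters, g, gt, reach=8):
    """Left side of (4.6): sum over k, g' of conj(a_k(2g' + gt)) * a_k(2g' + g)."""
    return sum(complex(a.get(2 * gp + gt, 0)).conjugate() * a.get(2 * gp + g, 0)
               for a in filters for gp in range(-reach, reach + 1))

a0 = {0: 1, 1: 1}     # Haar scaling coefficients
a1 = {0: 1, 1: -1}    # Haar wavelet coefficients, read off from (2.4)
filters = (a0, a1)

check_4_2 = all(
    abs(lhs_4_2(filters[j], filters[k], g) - (2 if (j == k and g == 0) else 0)) < 1e-12
    for j in range(2) for k in range(2) for g in range(-3, 4))
check_4_6 = all(
    abs(lhs_4_6(filters, g, gt) - (2 if g == gt else 0)) < 1e-12
    for gt in (0, 1) for g in range(-4, 5))
```

Both flags come out `True` for the Haar pair. The parameter `reach` only needs to cover the support of the coefficients.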


Theorem 4.3. Suppose $\varphi$ generates a multiresolution analysis and $a_k(\gamma)$
satisfy (4.2) and (4.3) with $\psi_k$ defined by (4.1) and $\psi_0 = \varphi$. Then the functions
$\{2^{j/2}\psi_1(2^j x - \gamma)\}$ for $j \in \mathbb{Z}$, $\gamma \in \mathbb{Z}$ form an orthonormal basis of $L^2(\mathbb{R})$.


Proof: As before, let $W_0$ denote the orthogonal complement of $V_0$ in $V_1$, $V_1 =
V_0 \oplus W_0$. We claim $\{\psi_1(x - \gamma)\}_{\gamma \in \mathbb{Z}}$ is an orthonormal basis for $W_0$. This follows
because we have merely taken the basis for $V_1$ given by Lemma 4.2 and removed
$\{\psi_0(x - \gamma)\}_{\gamma \in \mathbb{Z}}$, which is a basis for $V_0$. By scaling we obtain

$$V_{j+1} = V_j \oplus W_j$$

and $\{2^{j/2}\psi_1(2^j x - \gamma)\}_{\gamma \in \mathbb{Z}}$ is an orthonormal basis for $W_j$. But

$$L^2(\mathbb{R}) = \bigoplus_{j \in \mathbb{Z}} W_j$$

by the density condition. Q.E.D.


As a simple variation on the theme, which we leave as an exercise to the reader,
the set of functions $\{\varphi(x - \gamma)\}$ for $\gamma \in \mathbb{Z}$ together with $\{2^{j/2}\psi_1(2^j x - \gamma)\}$ for
$j \ge 0$, $\gamma \in \mathbb{Z}$ form an orthonormal basis of $L^2(\mathbb{R})$. The advantage of this variant is
that we scale only to finer and finer resolutions ($j \to +\infty$) and take care of all the
coarser resolutions ($j < 0$) by the single family $\{\varphi(x - \gamma)\}_{\gamma \in \mathbb{Z}}$.

In summary, we have reduced the construction of wavelets to the solution of the
algebraic identities (4.2) and (4.3), modulo some technical conditions to ensure the
orthonormality condition (iv). Step 5 will be to actually produce the solutions to
(4.2) and (4.3), and Step 6 will be to establish various properties of the wavelet
functions: regularity, decay at infinity, and moment conditions.

The reason we have postponed some of the details in the construction so far is 
that they require a new technique. So it is now time to open the door and invite 
Fourier back in. 


§5. THE VIEW FROM THE FOURIER TRANSFORM SIDE. Suppose we take the
Fourier transform of everything in sight. Because most of our identities have a
convolutional structure, we expect a simplification, with multiplicative identities
arising in their place. Before doing so, let us return to the orthonormality question,
because here the Fourier transform viewpoint gives us an entirely new handle on
the problem. Given $\varphi \in L^2$, how can we tell from $\hat\varphi$ whether or not $\{\varphi(x - \gamma)\}_{\gamma \in \mathbb{Z}}$
is orthonormal?
It will simplify matters if we adopt the convention (as in [SW]) that

$$\hat\varphi(x) = \int e^{2\pi i x \cdot y}\varphi(y)\,dy \tag{5.1}$$

so that the Fourier inversion formula is just

$$\hat{\hat\varphi}(x) = \varphi(-x) \tag{5.2}$$

and the Plancherel formula is

$$\|\hat\varphi\|_2 = \|\varphi\|_2 \tag{5.3}$$

(warning: not all the references follow this convention!).


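Lemma 5.1. $\{\varphi(x - \gamma)\}_{\gamma \in \mathbb{Z}}$ is an orthonormal set if and only if

$$\sum_{\gamma \in \mathbb{Z}} |\hat\varphi(\xi + \gamma)|^2 = 1 \quad \text{for all } \xi. \tag{5.4}$$

Proof: By the Plancherel formula, $\{\varphi(x - \gamma)\}_{\gamma \in \mathbb{Z}}$ is orthonormal if and only if

$$\int e^{2\pi i \gamma \xi} |\hat\varphi(\xi)|^2\,d\xi = \delta(\gamma, 0). \tag{5.5}$$

But the integral over $\mathbb{R}$ can be broken up into an integral over $[0, 1]$ and a sum over
$\mathbb{Z}$. Since $e^{2\pi i \gamma \xi}$ is periodic we obtain

$$\int_0^1 e^{2\pi i \gamma \xi} \sum_{\gamma' \in \mathbb{Z}} |\hat\varphi(\xi + \gamma')|^2\,d\xi = \delta(\gamma, 0)$$

which means that the function $\sum_{\gamma' \in \mathbb{Z}} |\hat\varphi(\xi + \gamma')|^2$ on $[0, 1]$ has as Fourier
coefficients $\delta(\gamma, 0)$, hence must be the constant function given by (5.4). Q.E.D.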
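Condition (5.4) can be checked numerically for the Haar scaling function, whose Fourier transform under the convention (5.1) is $e^{\pi i \xi}\sin \pi\xi / \pi\xi$. The sketch below is only an approximation: the sum over $\mathbb{Z}$ is truncated at a hypothetical cutoff `terms`, so the result equals 1 up to an error of order $1/\text{terms}$.

```python
import math

def haar_hat_sq(xi):
    """|phi^(xi)|^2 for the Haar scaling function: (sin(pi xi) / (pi xi))^2."""
    if xi == 0:
        return 1.0
    return (math.sin(math.pi * xi) / (math.pi * xi)) ** 2

def periodized(xi, terms=20000):
    """Truncated version of the sum in (5.4), over |gamma| <= terms."""
    return sum(haar_hat_sq(xi + g) for g in range(-terms, terms + 1))

samples = [0.0, 0.1, 0.25, 0.5, 0.9]
values = [periodized(xi) for xi in samples]
```

Every entry of `values` is within about $10^{-5}$ of 1, in agreement with the orthonormality of the translates of the box function.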


Now the scaling identity (4.1) transcribes easily into the condition

$$\hat\psi_k(\xi) = A_k(\tfrac{1}{2}\xi)\,\hat\varphi(\tfrac{1}{2}\xi) \tag{5.6}$$

where

$$A_k(\xi) = \frac{1}{2} \sum_{\gamma \in \mathbb{Z}} a_k(\gamma) e^{2\pi i \gamma \xi} \tag{5.7}$$

(exercise, using the definition of the Fourier transform and a change of variable).
Notice that $A_k(\xi)$ is smooth and periodic. Then (4.3) says

$$A_0(0) = 1 \tag{5.8}$$

and (3.9) says

$$\hat\varphi(0) = 1. \tag{5.9}$$

By iterating (5.6) for $k = 0$ (remember $\psi_0 = \varphi$) we obtain the infinite product
representation

$$\hat\varphi(\xi) = \prod_{k=1}^{\infty} A_0(2^{-k}\xi) \tag{5.10}$$

(using (5.8) we can justify the local uniform convergence of the infinite product).
Substituting (5.10) back into (5.6) we obtain

$$\hat\psi_1(\xi) = A_1(\tfrac{1}{2}\xi) \prod_{k=2}^{\infty} A_0(2^{-k}\xi). \tag{5.11}$$

Thus the functions $A_k$ completely and explicitly determine the wavelets.
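The infinite product (5.10) is easy to evaluate numerically, because the factors tend to 1 geometrically fast. The following sketch is illustrative only: it truncates the product after a hypothetical number of `factors` and compares the result, for the Haar choice $A_0(\xi) = \tfrac{1}{2}(1 + e^{2\pi i \xi})$, with the known transform $e^{\pi i \xi}\sin \pi\xi / \pi\xi$ of the box function.

```python
import cmath
import math

def A0_haar(xi):
    """A_0 for the Haar coefficients a(0) = a(1) = 1, following (5.7)."""
    return 0.5 * (1 + cmath.exp(2j * math.pi * xi))

def phi_hat_product(A0, xi, factors=40):
    """Truncated infinite product (5.10): product of A0(xi / 2^k) for k = 1..factors."""
    p = 1 + 0j
    for k in range(1, factors + 1):
        p *= A0(xi / 2 ** k)
    return p

def haar_hat(xi):
    """Closed form of the Fourier transform of the box function on [0, 1)."""
    if xi == 0:
        return 1 + 0j
    return cmath.exp(1j * math.pi * xi) * math.sin(math.pi * xi) / (math.pi * xi)

samples = [0.0, 0.3, 1.0, 2.5, -7.25]
errors = [abs(phi_hat_product(A0_haar, xi) - haar_hat(xi)) for xi in samples]
```

The entries of `errors` are at the level of rounding error. The same function `phi_hat_product` can be fed any other trigonometric polynomial with $A_0(0) = 1$.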

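The most intricate part of the transcription process is the identity (4.2) that the
coefficients $a_k(\gamma)$ must satisfy. What does this tell us about the functions $A_k$?
Rather than deal with this question directly (try it as an exercise, after the fact) we
repeat the process which led to (4.2), namely the consistency of (4.1), alias (5.6),
with the orthonormality, alias (5.4). In other words, if $\{\varphi(x - \gamma)\}_{\gamma \in \mathbb{Z}}$ is orthonormal
then (5.4) must hold, and if (5.6) defines $\hat\psi_k$ then we want the analogue of
(5.4), namely

$$\sum_{\gamma \in \mathbb{Z}} \hat\psi_j(\xi + \gamma)\overline{\hat\psi_k(\xi + \gamma)} = \delta_{jk}. \tag{5.12}$$

Now let $\eta_1 = 0$ and $\eta_2 = 1/2$. These are representatives of the cosets of the
subgroup $\mathbb{Z}$ in $(1/2)\mathbb{Z}$. Then points of the lattice $\mathbb{Z}$ can be represented uniquely as
$2(\gamma + \eta_p)$ as $\gamma$ varies in $\mathbb{Z}$ and $p = 1, 2$. Then

$$\sum_{\gamma \in \mathbb{Z}} \hat\psi_j(\xi + \gamma)\overline{\hat\psi_k(\xi + \gamma)} = \sum_{p=1}^{2} \sum_{\gamma \in \mathbb{Z}} \hat\psi_j(\xi + 2(\gamma + \eta_p))\overline{\hat\psi_k(\xi + 2(\gamma + \eta_p))}$$

by the above parametrization of $\mathbb{Z}$, and if we substitute (5.6) and use the
periodicity of $A_k$ we obtain

$$\sum_{p=1}^{2} A_j(\tfrac{1}{2}\xi + \eta_p)\overline{A_k(\tfrac{1}{2}\xi + \eta_p)} \sum_{\gamma \in \mathbb{Z}} |\hat\varphi(\tfrac{1}{2}\xi + \gamma + \eta_p)|^2.$$

The inner sum over $\mathbb{Z}$ yields the constant 1, and so (5.12) yields the consistency
condition

$$\sum_{p=1}^{2} A_j(\xi + \eta_p)\overline{A_k(\xi + \eta_p)} = \delta_{jk}. \tag{5.13}$$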
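This is the Fourier transform equivalent of (4.2). Note that (5.13) implies

$$|A_k(\xi)| \le 1 \tag{5.14}$$

which implies the boundedness of the Fourier transforms $\hat\psi_k$.

We can now easily supply the missing proof of Lemma 4.2. Notice that (5.13)
says that for every $\xi$, the $2 \times 2$ matrix $\{A_k(\xi + \eta_p)\}$ is unitary by rows. But
this is equivalent to being unitary by columns,

$$\sum_{k=0,1} A_k(\xi + \eta_p)\overline{A_k(\xi + \eta_q)} = \delta_{pq}. \tag{B2.1}$$

Now substituting (5.7) into (B2.1) we obtain

$$\frac{1}{4} \sum_{\gamma \in \mathbb{Z}} \Bigl( \sum_{k=0,1} \sum_{\gamma' \in \mathbb{Z}} a_k(\gamma + \gamma')\overline{a_k(\gamma')}\, e^{2\pi i \gamma \eta_p} e^{2\pi i \gamma'(\eta_p - \eta_q)} \Bigr) e^{2\pi i \gamma \xi} = \delta_{pq}.$$

Regarding this as an identity between Fourier series expansions we can
equate coefficients to conclude

$$\frac{1}{4} \sum_{k=0,1} \sum_{\gamma' \in \mathbb{Z}} a_k(\gamma + \gamma')\overline{a_k(\gamma')}\, e^{2\pi i \gamma \eta_p} e^{2\pi i \gamma'(\eta_p - \eta_q)} = \delta_{pq}\,\delta(\gamma, 0).$$

Choosing $\eta_p = 0$ and summing over $q$ we obtain (4.6) for $\tilde\gamma = 0$ since

$$\sum_{q=1}^{2} e^{-2\pi i \gamma' \eta_q} = \begin{cases} 2 & \text{if } \gamma' \in 2\mathbb{Z} \\ 0 & \text{otherwise.} \end{cases}$$

Similarly, choosing $\eta_p = 1/2$, multiplying by $e^{2\pi i \eta_q}$ and summing over $q$ we
obtain (4.6) for $\tilde\gamma = 1$.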


The time has come to grasp the bull by the horns and prove the orthonormality
of $\{\varphi(x - \gamma)\}_{\gamma \in \mathbb{Z}}$ directly. For this we will need an additional hypothesis.


Theorem 5.2. Suppose

$$A_0(\xi) \ne 0 \quad \text{for } |\xi| \le \tfrac{1}{4}. \tag{5.15}$$

Then $\{\varphi(x - \gamma)\}_{\gamma \in \mathbb{Z}}$ is orthonormal.


Proof: We construct a sequence of functions $\varphi_j$ such that $\{\varphi_j(x - \gamma)\}_{\gamma \in \mathbb{Z}}$ is
orthonormal, and such that $\varphi_j \to \varphi$ in $L^2$ norm as $j \to \infty$. For $\varphi_0$ we simply take
$\hat\varphi_0(\xi) = \chi_{[-1/2, 1/2]}(\xi)$. Then $\{\varphi_0(x - \gamma)\}_{\gamma \in \mathbb{Z}}$ is orthonormal by Lemma 5.1 because
(5.4) has exactly one non-zero term.

Inductively define functions $\varphi_j$ by

$$\hat\varphi_j(\xi) = A_0(\tfrac{1}{2}\xi)\,\hat\varphi_{j-1}(\tfrac{1}{2}\xi). \tag{5.16}$$

We claim that $\{\varphi_j(x - \gamma)\}_{\gamma \in \mathbb{Z}}$ is again orthonormal. This follows immediately from
(5.13) with $j = k = 0$ and Lemma 5.1. It can also be deduced from

$$\varphi_j(x) = \sum_{\gamma \in \mathbb{Z}} a_0(\gamma)\varphi_{j-1}(2x - \gamma) \tag{5.17}$$

which is the non-Fourier transform version of (5.16), and (4.2). Note that

$$\hat\varphi_j(\xi) = \Bigl( \prod_{k=1}^{j} A_0(2^{-k}\xi) \Bigr) \chi_{[-2^{j-1},\, 2^{j-1}]}(\xi) \tag{5.18}$$

so that $\hat\varphi_j \to \hat\varphi$ pointwise, by (5.10).

We would like to show $\varphi_j \to \varphi$ in $L^2$ norm. This will suffice to complete the
proof, because the norm limit of orthonormal sets is an orthonormal set. This is
the key point of the proof, where the non-vanishing hypothesis must be used. (As
an interesting exercise, see how the argument breaks down for the counterexample
given in §3.)

By the Plancherel formula it suffices to show $\hat\varphi_j \to \hat\varphi$ in $L^2$ norm, and since we
have pointwise convergence we would like to use the dominated convergence
theorem. Note first that $\hat\varphi \in L^2$ by Fatou's theorem, since it is the pointwise limit
of $\hat\varphi_j$ and $\|\hat\varphi_j\|_2 = 1$. Thus we can use a multiple of $\hat\varphi$ as a dominator. By
comparing (5.18) and (5.10) we see

$$\hat\varphi_j(\xi) = \begin{cases} \hat\varphi(\xi) / \hat\varphi(2^{-j}\xi) & \text{if } |\xi| \le 2^{j-1} \\ 0 & \text{otherwise.} \end{cases} \tag{5.19}$$

We claim that $\hat\varphi$ is bounded from below on $[-1/2, 1/2]$. The point is that $\hat\varphi$ is
continuous, and by (5.15) $A_0(2^{-k}\xi) \ne 0$ for $|\xi| \le 1/2$. Thus $\hat\varphi$ doesn't vanish on
$[-1/2, 1/2]$, so $|\hat\varphi_j(\xi)| \le c\,|\hat\varphi(\xi)|$ for $c = (\inf_{|\xi| \le 1/2} |\hat\varphi(\xi)|)^{-1}$. Q.E.D.


§6. THE RECIPE. So now we have indicated all the major steps in the
construction, but we have left the first to last. We need to find actual solutions to the
algebraic identities (5.8), (5.13) and (5.15). There are several different approaches
to this problem. We describe one that is due to Ingrid Daubechies [D1].

We look for solutions with only a finite number of $a_k(\gamma)$ different from zero,
which means $A_k(\xi)$ are trigonometric polynomials. This implies that the scaling
function $\varphi$ and wavelet $\psi_1$ have compact support. This can be seen most easily
from the iteration procedure (3.7) and (3.8). Say $a(\gamma) = 0$ unless $\gamma \in [0, N]$; then
if $f$ has support in $[0, N]$, so does $Sf$.

We concentrate first on finding the function $A_0$, which must satisfy three
conditions:

$$A_0(0) = 1 \tag{6.1}$$

$$|A_0(\xi)|^2 + |A_0(\xi + \tfrac{1}{2})|^2 = 1 \tag{6.2}$$

$$A_0(\xi) \ne 0 \quad \text{for } |\xi| \le \tfrac{1}{4} \tag{6.3}$$

(here (6.1) is (5.8), (6.2) is (5.13) for $j = k = 0$ and (6.3) is (5.15)). And, of course,
$A_0$ must be of the form

$$A_0(\xi) = \frac{1}{2} \sum_{\gamma \in \mathbb{Z}} a(\gamma) e^{2\pi i \gamma \xi} \quad \text{(finite sum)}. \tag{6.4}$$

Note that $|A_0(\xi)|^2$ is then of the same form.
Now we already know one solution, namely

$$A_0(\xi) = \tfrac{1}{2}(1 + e^{2\pi i \xi}) = e^{\pi i \xi} \cos \pi\xi$$

which yields the Haar wavelets. This was deemed unsatisfactory because the
wavelets are not continuous. One way to create continuity and even differentiability
is to take convolution powers, or on the Fourier transform side to take ordinary
powers. Thus we are tempted to try $A_0(\xi) = (e^{\pi i \xi} \cos \pi\xi)^N$ for some large $N$.
Unfortunately (6.2) no longer holds, but we can fix this up. Note that
$\cos \pi(\xi + 1/2) = -\sin \pi\xi$, so that is why $|\cos \pi\xi|^2 + |\cos \pi(\xi + 1/2)|^2 = 1$.
Now take the identity $\cos^2 \pi\xi + \sin^2 \pi\xi = 1$ and raise it to an odd power, say

$$1 = (\cos^2 \pi\xi + \sin^2 \pi\xi)^5
 = \cos^{10} \pi\xi + 5\cos^8 \pi\xi \sin^2 \pi\xi + 10\cos^6 \pi\xi \sin^4 \pi\xi
 + 10\cos^4 \pi\xi \sin^6 \pi\xi + 5\cos^2 \pi\xi \sin^8 \pi\xi + \sin^{10} \pi\xi.$$

Take the first half of the terms for $|A_0|^2$,

$$|A_0(\xi)|^2 = \cos^{10} \pi\xi + 5\cos^8 \pi\xi \sin^2 \pi\xi + 10\cos^6 \pi\xi \sin^4 \pi\xi. \tag{6.5}$$

Replacing $\xi$ by $\xi + 1/2$ turns these into the second half of the terms, so (6.2) is
automatic, and (6.1) and (6.3) are easy. This gives a recipe for producing $|A_0|^2$, and
it remains to take a square root of the form (6.4). We would also like to take the
coefficients $a_0(\gamma)$ in (6.4) to be real, for that will yield a real-valued scaling
function (and in the end real-valued wavelets as well). There is a general theorem
of F. Riesz that asserts that this is possible, but in this case it is easy enough to
accomplish by trial and error. Since

$$|A_0(\xi)|^2 = \cos^6 \pi\xi\,(\cos^4 \pi\xi + 5\cos^2 \pi\xi \sin^2 \pi\xi + 10\sin^4 \pi\xi)
 = \cos^6 \pi\xi\,\bigl\{ (\cos^2 \pi\xi - \sqrt{10}\,\sin^2 \pi\xi)^2 + (5 + 2\sqrt{10})\cos^2 \pi\xi \sin^2 \pi\xi \bigr\}$$

we can take

$$A_0(\xi) = (e^{\pi i \xi} \cos \pi\xi)^3 \bigl\{ \cos^2 \pi\xi - \sqrt{10}\,\sin^2 \pi\xi + i\sqrt{5 + 2\sqrt{10}}\,\cos \pi\xi \sin \pi\xi \bigr\}$$

$$= \frac{1}{8}(e^{2\pi i \xi} + 1)^3 \Bigl[ \frac{1 - \sqrt{10}}{2} + \frac{1 + \sqrt{10}}{4}(e^{2\pi i \xi} + e^{-2\pi i \xi})
 + \frac{1}{4}\sqrt{5 + 2\sqrt{10}}\,(e^{2\pi i \xi} - e^{-2\pi i \xi}) \Bigr] \tag{6.6}$$

which is clearly of the form (6.4) with $a_0(\gamma)$ real and $a_0(\gamma) \ne 0$ only if $-1 \le \gamma \le 4$.
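The claims about (6.6) can be verified numerically. The following is a sketch under stated assumptions, not a derivation: it evaluates the first form of (6.6) directly, compares $|A_0|^2$ with (6.5), tests (6.1) and (6.2), and recovers the coefficients $a_0(\gamma)$ from (6.4) by a discrete Fourier sum (exact up to rounding for a trigonometric polynomial of this degree). The function names and the sample count `n` are illustrative choices.

```python
import cmath
import math

R10 = math.sqrt(10)

def A0(xi):
    """First form of (6.6)."""
    c, s = math.cos(math.pi * xi), math.sin(math.pi * xi)
    front = (cmath.exp(1j * math.pi * xi) * c) ** 3
    return front * complex(c * c - R10 * s * s, math.sqrt(5 + 2 * R10) * c * s)

def rhs_6_5(xi):
    """Right side of (6.5)."""
    c2, s2 = math.cos(math.pi * xi) ** 2, math.sin(math.pi * xi) ** 2
    return c2 ** 5 + 5 * c2 ** 4 * s2 + 10 * c2 ** 3 * s2 ** 2

def coefficient(g, n=64):
    """a_0(g) = 2 * integral over [0, 1] of A0(xi) * exp(-2 pi i g xi), as an n-point sum."""
    return 2 * sum(A0(m / n) * cmath.exp(-2j * math.pi * g * m / n) for m in range(n)) / n

a0 = {g: coefficient(g) for g in range(-4, 8)}
support = [g for g, v in a0.items() if abs(v) > 1e-12]
```

The list `support` comes out as the integers from $-1$ to 4, the imaginary parts of the coefficients vanish to rounding error, and the coefficients sum to 2 as (3.6) requires.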
To complete the story we need to find $A_1(\xi)$, also of the form (6.4), which
satisfies

$$|A_1(\xi)|^2 + |A_1(\xi + \tfrac{1}{2})|^2 = 1 \tag{6.7}$$

and

$$A_0(\xi)\overline{A_1(\xi)} + A_0(\xi + \tfrac{1}{2})\overline{A_1(\xi + \tfrac{1}{2})} = 0 \tag{6.8}$$

(these are the remaining conditions of (5.13)). Fortunately, this can be
accomplished just by taking

$$A_1(\xi) = e^{2\pi i \xi}\,\overline{A_0(\xi + \tfrac{1}{2})} \tag{6.9}$$

which amounts to setting

$$a_1(\gamma) = (-1)^{\gamma + 1}\,\overline{a_0(1 - \gamma)}. \tag{6.10}$$

Then (6.7) and (6.8) follow directly from (6.2) and the periodicity of $A_0$. Note also
that $a_1(\gamma)$ are real valued if $a_0(\gamma)$ are.
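A quick numerical test of (6.7) and (6.8): the sketch below, which is illustrative only, takes $A_0$ from the first form of (6.6), defines $A_1$ by (6.9), and assembles the $2 \times 2$ matrix of sums appearing in (5.13). The helper name `gram` is hypothetical.

```python
import cmath
import math

R10 = math.sqrt(10)

def A0(xi):
    """First form of (6.6)."""
    c, s = math.cos(math.pi * xi), math.sin(math.pi * xi)
    front = (cmath.exp(1j * math.pi * xi) * c) ** 3
    return front * complex(c * c - R10 * s * s, math.sqrt(5 + 2 * R10) * c * s)

def A1(xi):
    """The choice (6.9): A_1(xi) = exp(2 pi i xi) * conj(A_0(xi + 1/2))."""
    return cmath.exp(2j * math.pi * xi) * A0(xi + 0.5).conjugate()

def gram(xi):
    """Matrix with entries sum over p of A_j(xi + eta_p) * conj(A_k(xi + eta_p)), as in (5.13)."""
    points = (xi, xi + 0.5)
    funcs = (A0, A1)
    return [[sum(funcs[j](p) * funcs[k](p).conjugate() for p in points)
             for k in range(2)] for j in range(2)]

def max_defect(xi):
    """Largest deviation of gram(xi) from the identity matrix."""
    G = gram(xi)
    return max(abs(G[j][k] - (1 if j == k else 0)) for j in range(2) for k in range(2))

defects = [max_defect(xi) for xi in (0.0, 0.13, 0.25, 0.41, 0.8)]
```

All entries of `defects` are at the level of rounding error, so the pair $(A_0, A_1)$ satisfies (6.2), (6.7) and (6.8) at the sampled points.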
The Fourier transform of $\psi_1$ is given by (5.11), which now reads

$$\hat\psi_1(\xi) = A_1(\tfrac{1}{2}\xi) \prod_{k=2}^{\infty} A_0(2^{-k}\xi) \tag{6.11}$$

with $A_0$ given by (6.6) and $A_1$ by (6.9). If we want to obtain the wavelet $\psi_1$ itself
rather than its Fourier transform we first find $\psi_0 = \varphi$ by iterating the mapping

$$Sf(x) = \sum_{\gamma} a_0(\gamma) f(2x - \gamma) \tag{6.12}$$

starting with any reasonable $f$ satisfying $\int f(x)\,dx = 1$, and then setting

$$\psi_1(x) = \sum_{\gamma} a_1(\gamma)\varphi(2x - \gamma). \tag{6.13}$$

See Figures 2 and 3.

Figure 2. The graph of the scaling function $\varphi$, courtesy of David Aronstein.


Figure 3. The graph of the wavelet generator $\psi_1$, courtesy of David Aronstein.

There is an alternative approach to constructing the scaling function that
yields a different wavelet basis. It has the advantage of requiring less algebra,
but the disadvantage of producing wavelets that are not compactly supported.
Start with the Haar basis scaling function $\chi_{[0,1)}$, whose Fourier transform is
$e^{\pi i \xi}(\sin \pi\xi / \pi\xi)$, and take the $N$-fold convolution product

$$g = \chi_{[0,1)} * \chi_{[0,1)} * \cdots * \chi_{[0,1)} \quad (N \text{ factors})$$

so that

$$\hat g(\xi) = \Bigl( e^{\pi i \xi}\,\frac{\sin \pi\xi}{\pi\xi} \Bigr)^N.$$

It is easy to see that $g \in C^{N-2}$, but of course we have destroyed the
orthonormality of translates by $\mathbb{Z}$ that $\chi_{[0,1)}$ had. Too bad, but this is easily
fixed. Write

$$h(\xi) = \Bigl( \sum_{\gamma \in \mathbb{Z}} |\hat g(\xi + \gamma)|^2 \Bigr)^{1/2}$$

and observe that $h$ is periodic and

$$0 < c_1 \le h(\xi) \le c_2 < \infty.$$

Then we have only to take

$$\hat\varphi(\xi) = \hat g(\xi) / h(\xi)$$

and (5.4) is automatic, so we have the orthonormality of $\{\varphi(x - \gamma)\}_{\gamma \in \mathbb{Z}}$.
Notice that $\hat g(0) = 1$ and $\hat g(\gamma) = 0$ for $\gamma \ne 0$, so $\hat\varphi(0) = 1$ as required. And it
is not difficult to show that $\varphi \in C^{N-2}$.
What about the scaling identity? Well, it certainly holds for $g$, namely

$$\hat g(\xi) = B(\xi/2)\,\hat g(\xi/2)$$

where

$$B(\xi) = (e^{\pi i \xi} \cos \pi\xi)^N$$

has the required form (6.4). It then follows that

$$\hat\varphi(\xi) = A_0(\xi/2)\,\hat\varphi(\xi/2)$$

where

$$A_0(\xi) = B(\xi)\,h(\xi) / h(2\xi).$$

Now $A_0$ is periodic, so it must have the form (6.4), but the sum is no longer
finite. This is where we lose the compact support of $\varphi$. On the other hand $A_0$
is clearly smooth, so the Fourier coefficients in (6.4) must be rapidly decreasing,
which implies that $\varphi$ is rapidly decreasing.
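The orthonormalization step can be tried numerically. The sketch below is an illustration under assumptions that are not in the text: it fixes $N = 2$ (so $g$ is the piecewise linear hat function), truncates the sum defining $h^2$ at a hypothetical cutoff `terms`, and checks that $A_0 = B\,h(\xi)/h(2\xi)$ satisfies (6.2).

```python
import math

N = 2  # number of box factors in g; N = 2 is chosen here only for illustration

def g_hat_sq(xi):
    """|g^(xi)|^2 = (sin(pi xi) / (pi xi))^(2N)."""
    if xi == 0:
        return 1.0
    return (math.sin(math.pi * xi) / (math.pi * xi)) ** (2 * N)

def h_sq(xi, terms=2000):
    """h(xi)^2, the sum over integers of |g^(xi + gamma)|^2, truncated to |gamma| <= terms."""
    return sum(g_hat_sq(xi + g) for g in range(-terms, terms + 1))

def A0_sq(xi):
    """|A_0(xi)|^2 where A_0 = B * h(xi) / h(2 xi) and |B(xi)|^2 = cos(pi xi)^(2N)."""
    return math.cos(math.pi * xi) ** (2 * N) * h_sq(xi) / h_sq(2 * xi)

samples = (0.05, 0.2, 0.37)
defects = [abs(A0_sq(xi) + A0_sq(xi + 0.5) - 1) for xi in samples]
bounds = (min(h_sq(k / 20) for k in range(20)), max(h_sq(k / 20) for k in range(20)))
```

The entries of `defects` vanish up to truncation error, and `bounds` shows that $h^2$ stays between positive constants on the sampled points, as the construction requires.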

The construction of $A_1(\xi)$ and the wavelet Fourier transform $\hat\psi_1(\xi)$ then
proceeds via (6.9) and (6.11) as before.

§7. SMOOTHNESS OF WAVELETS. How smooth are our wavelets? Since we
understand them best on the Fourier transform side, we will use the principle that
decay at infinity of $\hat\varphi$ implies smoothness of $\varphi$ (we will establish smoothness of the
scaling function and pass it on to the wavelets via (6.13)). For example, it is easy to
show

$$|\hat\varphi(\xi)| \le c\,(1 + |\xi|)^{-N-1-\varepsilon} \tag{7.1}$$

implies $\varphi \in C^N$. So how do we establish (7.1)?
We have the infinite product representation (5.10) which says

$$\hat\varphi(\xi) = \prod_{k=1}^{\infty} A_0(2^{-k}\xi) \tag{7.2}$$

and $A_0$ is periodic. Since each factor does not decay at infinity, why should the
product? This is a mystery, which is best solved by looking at the simplest case,
$A_0(\xi) = \cos \pi\xi$. Then

$$\prod_{k=1}^{\infty} \cos 2^{-k}\pi\xi = \frac{\sin \pi\xi}{\pi\xi} \tag{7.3}$$

does decay at the rate $O(|\xi|^{-1})$. (Formula (7.3) was proved by Euler, but special
cases were known by François Viète in the late 1500's. You can prove it by
considering the Fourier transform of $\chi_{[-1/2, 1/2]}$ and its scaling properties.)

Clearly, for most choices of $\xi$, the values of $\cos 2^{-k}\pi\xi$ will occasionally become
small, and that makes the product (7.3) small. You might try to get around this by
taking $\xi = 2^N$ for large $N$. Thus $\cos 2^{-k}\pi\xi = \pm 1$ for $k = 1, \ldots, N$, so there is no
decay, but then $\cos 2^{-N-1}\pi\xi = 0$ wipes you out. You can try to quantify this
line of reasoning, but there is no great payoff in showing, for example, that
$\sin \pi\xi / \pi\xi = O(|\xi|^{-2/3})$, so we will take (7.3) as our starting point.

The expression (6.6) for $A_0$, or any of its more complicated cousins, contains
$\cos \pi\xi$ as a factor, many times. Thus $\hat\varphi(\xi)$ contains $\sin \pi\xi / \pi\xi$ as a factor many
times, hence we expect decay. Unfortunately, the other factor grows. It is easier to
work with $|A_0|^2$ given by (6.5), if we remember to take the square root at the end.
We have, for the special case considered,

$$|A_0(\xi)|^2 = (\cos \pi\xi)^6\,(\cos^4 \pi\xi + 5\cos^2 \pi\xi \sin^2 \pi\xi + 10\sin^4 \pi\xi).$$


The first factor produces decay $O(|\xi|^{-6})$. The second factor can be written
$1 + 3\sin^2 \pi\xi + 6\sin^4 \pi\xi$ so it clearly has a maximum value 10 at $\xi = 1/2$. We can
obtain a crude estimate for the growth rate produced by the second factor by the
following reasoning: if $|\xi| \approx 2^N$ then there will be about $N$ factors where $2^{-k}|\xi|$ is
large, so an upper bound for the product is a constant times $10^N$. But $10^N = |\xi|^a$
for $a = \log 10 / \log 2 \approx 3.32$. So the growth rate is at most $O(|\xi|^{3.32})$, so the
combination gives $O(|\xi|^{-2.68})$ for $|\hat\varphi(\xi)|^2$, hence $O(|\xi|^{-1.34})$ for $\hat\varphi(\xi)$.

This is a disappointing estimate. According to (7.1) it suffices only to show that
$\varphi$ is continuous. It can be improved, but not by a lot. To see why, consider
$\xi = 2^N/3$. Then for each of the $N$ factors $2^{-k}\xi = 2^{N-k}/3$, $1 \le k \le N$, we have
$1 + 3\sin^2(2^{N-k}\pi/3) + 6\sin^4(2^{N-k}\pi/3) = 1 + 3 \cdot (\sqrt{3}/2)^2 + 6 \cdot (\sqrt{3}/2)^4 = 6.625$, so
a lower bound for $a$ is $\log 6.625 / \log 2$, which yields $O(|\xi|^{-1.636})$ as the optimal
improvement.
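The arithmetic behind these two exponents is short enough to reproduce. The sketch below is only a restatement of the estimates above in code, with illustrative names: it evaluates the second factor, converts the per-factor bounds 10 and 6.625 into growth exponents $a$, and combines each with the decay $|\xi|^{-6}$ of the first factor.

```python
import math

def second_factor(xi):
    """1 + 3 sin^2(pi xi) + 6 sin^4(pi xi), the second factor of |A_0|^2."""
    s = math.sin(math.pi * xi) ** 2
    return 1 + 3 * s + 6 * s * s

upper_a = math.log(10) / math.log(2)                     # from the maximum value 10
lower_a = math.log(second_factor(1 / 3)) / math.log(2)   # from xi = 2^N / 3

# |phi^|^2 behaves like |xi|^(-6 + a), so phi^ itself like |xi|^(-(6 - a) / 2).
decay_from_upper = (6 - upper_a) / 2
decay_from_lower = (6 - lower_a) / 2

# The value 6.625 is attained at every point 2^m / 3, since sin^2(2^m pi / 3) = 3/4.
orbit = [second_factor(2 ** m / 3) for m in range(10)]
```

Rounded, `decay_from_upper` is 1.34 and `decay_from_lower` is 1.636, matching the exponents quoted in the text.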

If we consider the family of wavelets constructed as outlined in §6, we will have
$|A_0|^2$ written as the product of higher and higher powers of $\cos \pi\xi$ by more and
more complicated second factors. Thus we have faster decay times faster growth in
$\hat\varphi(\xi)$. Which wins? Well, it is a close race! It turns out that the decay wins, but the
crude method of estimating the growth used above is not good enough to show
this. The final result ([D1], [C2]) is that to create wavelets of class $C^N$ we need to
carry out the construction starting with $(\cos^2 \pi\xi + \sin^2 \pi\xi)^{2M-1} = 1$ for $M$ on the
order of $5(N + 1)$. This means that there is a rather high price to pay in terms of
complexity (the algebra required to pass from $|A_0|^2$ to $A_0$, for example) in order
to gain a moderate amount of smoothness. (More recently, better techniques have
been found to estimate the smoothness directly, without involving the Fourier
transform [DL].) Figure 4 shows the graph of $\hat\varphi(\xi)$. See [JRS] for a discussion of
the surprising self-similarity properties of this function.

Figure 4. The graph of $\hat\varphi$, after factoring out a power of $\sin \pi x / \pi x$, courtesy of Prem Janardhan and
David Rosenblum.

In addition to smoothness, another important property of wavelets is the
vanishing moment conditions

$$\int x^k \psi_1(x)\,dx = 0, \quad k = 0, 1, \ldots, N \tag{7.4}$$

which are equivalent to the vanishing of the Fourier transform to high order at the
origin,

$$\Bigl( \frac{d}{d\xi} \Bigr)^k \hat\psi_1(0) = 0, \quad k = 0, 1, \ldots, N. \tag{7.5}$$

In contrast to smoothness, however, it is only the wavelet, not the scaling function,
which enjoys this property. The significance of this condition is that it implies a
weak form of localization in the frequency (Fourier transform) variable, since the
Fourier transform of $\psi_1(2^j x - k)$ is mainly concentrated around values of $|\xi|$ on
the order of $2^j$. (There is yet another family of wavelets in which the Fourier
transform is actually supported in an annular region $c_1 2^j \le |\xi| \le c_2 2^j$. See [MI]
for a description of these "Littlewood-Paley" type wavelets.) For our wavelets the
verification of (7.5) is easy. From (6.11) we see that $\hat\psi_1$ has a factor $A_1((1/2)\xi)$,
and from (6.9) we see that $A_1$ at $\xi = 0$ has the same order zero as $A_0$ at $\xi = 1/2$.
But $A_0$ has a factor of $\cos \pi\xi$ to a power, hence vanishes at $\xi = 1/2$ to order 3 in
our particular example, and to order $M$ if we start with $(\cos^2 \pi\xi + \sin^2 \pi\xi)^{2M-1} = 1$
in our construction. Note that in general conditions (6.1) and (6.2) imply that
$A_0(1/2) = 0$, and the flatter we make $A_0$ near $\xi = 0$, the more it vanishes near
$\xi = 1/2$.


§8. CONCLUDING REMARKS. Why not try to create your own designer wavelets
by programming the recipe given in §6, and taking the square root of $|A_0(\xi)|^2$ in a
different way? For a more detailed discussion of the Riesz Lemma for doing this
see [D1].

For further information about wavelets, including historic accounts and attribu- 
tion of results, see the books [M], [BF], [BC] or the expository lectures [D2] and 
[FJW]. The term “wavelet” is also used to describe expansions in terms of 
functions which are not orthogonal. These wavelets have a simpler algebraic 
description, which is useful for some applications. An expanded version of this 
article, including a discussion of wavelet bases in several variables, will appear in 
[BF]. None of the theorems or proofs presented here are original; I have only tried
to organize the material in a way that is easy to digest.

A Matrix Maximum


William C. Waterhouse 


1. INTRODUCTION. In a recent paper [KZ], Kwong and Zettl described the
solution to a maximization problem involving a 2 by 2 matrix $A$ with real entries.
They used a special way of normalizing the matrix, and it led them into computations
that could only be done as symbolic manipulations on a computer. I want to
show how a more geometric normalization will uncover a latent symmetry in the
problem and thereby reduce the computation to comprehensible steps.

To understand the normalization, we should begin with a simpler, more familiar
question: what is the maximum of $\|Ax\| / \|x\|$, where $\|x\|$ denotes length in the
plane? The first thing to observe is that scaling $x$ does not change the ratio, and
thus we only have to consider its values for $x$ on the unit circle. Now $A$ clearly
must map the unit circle $x_1^2 + x_2^2 = 1$ to an ellipse (or straight line segment, if $A$ is
singular). Thus if the semi-axes of the ellipse are (say) $k$ and $m$, then the larger of
those two is the maximum of $\|Ax\| / \|x\|$.

We could turn this geometric analysis into a computation of the maximum, but
it is more important to see how it leads to an expression for the structure of $A$ (see
Figure 1). Let $R(\phi)$ be the matrix of rotation by angle $\phi$, so

$$R(\phi) = \begin{pmatrix} \cos(\phi) & -\sin(\phi) \\ \sin(\phi) & \cos(\phi) \end{pmatrix}.$$

Figure 1. Structure of a Linear Mapping



The inverse of R(@) is of course R(—¢). Take one of the half-axes of the ellipse, 
say of length k, and suppose it makes an angle a with the positive x-axis. Then 
R(—a)A maps the unit circle to an ellipse with axes along the coordinate axes. 
The half-axis lengths are still k and m, and so we can multiply by a diagonal 
matrix to get diag(k, m)~'R(—a)A mapping the unit circle to itself. Hence it is an 
orthogonal mapping, either some rotation R() or diag(1, — 1) times R(B). Multi- 
plying through and absorbing any negative sign into'm, we obtain the following 
result, a variant of the “singular value decomposition.” 
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Theorem 0. Every 2 by 2 matrix A can be written in the form 


k 0 
A=R(a)(5 2) RC) 
for some angles a, B and some constants k,m. 


You can verify that this geometric proof [OG, p. 343-6] does indeed have a
modification that works when A is singular. The same idea can be stated purely in
terms of linear algebra: the first step is to observe that $AA^T$ is symmetric with
positive eigenvalues, and hence it equals $R(\alpha)\,\mathrm{diag}(k^2, m^2)\,R(-\alpha)$ for suitable
$k, m, \alpha$. Then if you take $B = R(\alpha)\,\mathrm{diag}(k, m)\,R(-\alpha)$, you can easily verify that
$B^{-1}A$ is orthogonal. (The theorem is actually valid in any number of variables, with
special orthogonal matrices in place of rotations. See [H, p. 169] or [G, p. 286].) For
our purposes, the advantage of this decomposition of A is that it incorporates
information about the relation between $\|Ax\|$ and $\|x\|$. Observe that we have our
choice of the order in which the diagonal entries occur, and so for nonzero A we
can always suppose that $k$ is nonzero.
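
The linear-algebra form of the argument is easy to try numerically. The following is a minimal sketch in plain Python, not taken from the paper: the helper names `rot`, `mul`, and `decompose` are illustrative, and A is assumed to be invertible. It diagonalizes $AA^T$ by a rotation, forms $B = R(\alpha)\,\mathrm{diag}(k, m)\,R(-\alpha)$, and returns $B^{-1}A$ so that its orthogonality can be inspected.

```python
import math

def rot(phi):
    """Rotation matrix R(phi) as a nested list."""
    c, s = math.cos(phi), math.sin(phi)
    return [[c, -s], [s, c]]

def mul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def decompose(A):
    """Return (alpha, k, m, Q) with A = R(alpha) diag(k, m) R(-alpha) Q.

    Assumes A is invertible; Q should then come out orthogonal.
    """
    (a, b), (c, d) = A
    # Entries of the symmetric matrix S = A A^T = [[p, q], [q, r]].
    p, q, r = a * a + b * b, a * c + b * d, c * c + d * d
    # S = R(alpha) diag(k^2, m^2) R(-alpha) with k^2 >= m^2.
    alpha = 0.5 * math.atan2(2 * q, p - r)
    mean, rad = (p + r) / 2, math.hypot((p - r) / 2, q)
    k, m = math.sqrt(mean + rad), math.sqrt(mean - rad)
    # B^{-1} = R(alpha) diag(1/k, 1/m) R(-alpha).
    B_inv = mul(mul(rot(alpha), [[1 / k, 0], [0, 1 / m]]), rot(-alpha))
    return alpha, k, m, mul(B_inv, A)
```

Since $R(-\alpha)Q$ is again orthogonal, the returned factors give the shape of Theorem 0 once any reflection is absorbed into the sign of $m$.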


2. NORMALIZING THE PROBLEM. Fix now an invertible A. The problem
solved by Kwong and Zettl is to find the maximum of

\[ \frac{\|Ax\|^2}{\|x\| \cdot \|A^2x\|} \]

for nonzero x. As A is invertible, we can make a change of variable to replace x
by $A^{-1}x$; thus it is equivalent to say that we want the maximum of

\[ \frac{\|x\|^2}{\|A^{-1}x\| \cdot \|Ax\|}. \]


Clearly the ratio is again homogeneous in x, so the maximum can be found on
the unit circle. As Kwong and Zettl observed, we have

\[ \|Ax\| = \|R(\varphi)Ax\| = \|R(\varphi)AR(-\varphi)[R(\varphi)x]\|, \]

and similarly for $A^{-1}$, so the maximum for A is the same as for $R(\varphi)AR(-\varphi)$.
Furthermore, in this problem there is a homogeneity in A, so the maximum does
not change if we multiply A by a nonzero scalar. Now the decomposition of the
previous section shows us a nice way to simplify A using these operations: we can
first conjugate by a rotation to cancel the factor $R(\alpha)$, and then we can multiply by
a scalar to make the first entry in the diagonal factor equal to 1. Thus we have a
promising normalization:


Theorem 1. Let A be any 2 by 2 matrix not identically zero. Multiplying by a scalar
and conjugating by a rotation, we can reduce A to the form $MR(\theta)$, where $\theta$ is some
angle and $M = \mathrm{diag}(1, m)$ for some constant $m$. □


We now take A to be of the form $MR(\theta)$, and we let $K(m, \theta)$ be the maximum
as x varies over the unit circle. To eliminate square roots, we can work with the
square of the ratio and compute $K^2(m, \theta)$. Let t parametrize the unit circle by
angle, so x(t) will have entries cos(t) and sin(t). For brevity, set

\[ Q(t) = \|Mx(t)\|^2 = \cos^2 t + m^2\sin^2 t = m^2 + (1 - m^2)\cos^2 t. \tag{1} \]

Obviously $R(\theta)x(t) = x(t + \theta)$, so

\[ \|Ax(t)\|^2 = \|MR(\theta)x(t)\|^2 = Q(t + \theta). \]

Similarly, we have

\[ \|A^{-1}x(t)\|^2 = \|R(-\theta)M^{-1}x(t)\|^2 = \|M^{-1}x(t)\|^2 = Q(\pi/2 - t)/m^2. \]

Thus we have:


Lemma 2. Let $f(t) = m^2/[Q(t + \theta)\,Q(\pi/2 - t)]$. Then $K^2(m, \theta) = \max_t f(t)$. □


Before going on, it might be good to look at the graph of f(t) in a few examples.
The two basic types are illustrated in Figures 2 and 3. All use the same angle
$\theta = \pi/4$, so they also illustrate the change of behavior with m. Observe that there
are certain values of t depending only on $\theta$ (in this case, $t = \pi/8$ and $t = \pi/2 + \pi/8$)
that always give local extrema. For large m, however, the absolute maximum
occurs elsewhere and has a value independent of m. We shall show that these
properties are true in general.


Figure 2. f(t) with $\theta = \pi/4$ and m = 1.2 (solid), m = 1.8 (dashed)


Figure 3. f(t) with $\theta = \pi/4$ and m = 3 (solid), m = 5 (dashed)


3. THE LATENT SYMMETRY. It is clear from the original homogeneity that
$f(t + \pi) = f(t)$. But the expression for f in terms of Q shows that there is also a
latent symmetry. To bring it out, we set $p = (\pi/2 + \theta)/2$ and $u = t - (\pi/2 - \theta)/2$;
then we get $f(t) = m^2/g(u)$ with

\[ g(u) = Q(p + u)\,Q(p - u). \tag{2} \]


When we expand, we get

\begin{align*}
Q(p + u) &= m^2 + (1 - m^2)[\cos p\cos u - \sin p\sin u]^2 \\
&= m^2 + (1 - m^2)[\cos^2 p\cos^2 u + (1 - \cos^2 p)(1 - \cos^2 u)] \\
&\quad - 2(1 - m^2)\cos p\cos u\sin p\sin u. \tag{3}
\end{align*}

The symmetry now lets us observe that $Q(p - u)$ is the same as $Q(p + u)$ except
for the sign of the last term, which will be reversed. Thus the product g(u) will be
the difference of the squares. We can see at once that sines and cosines will occur
in g only as squares, and thus we are going to be able to use cosines alone;
furthermore, we see that the result will be quadratic in the variable $W = \cos^2 u$.
To find it explicitly, we have to do some straightforward computation. Of course
you can save time and avoid mistakes by doing it on a computer, but it can
certainly be handled by hand. Here is the result.


Lemma 3. Set $W = \cos^2 u$. Then $g(u) = G(W)$ where

\[ G(W) = (1 - m^2)^2 W^2 + 2(1 - m^2)[m^2\cos^2 p + \cos^2 p - 1]\,W + (1 - m^2)^2\cos^4 p - 2(1 - m^2)\cos^2 p + 1. \]
□


4. COMPUTATION OF THE CRITICAL VALUES. Our G(W) is identically 1 if
$m^2 = 1$; otherwise it is quadratic. The coefficient of the square term is positive,
and hence the unique extreme value of G will be its absolute minimum. There is
basically no difficulty in finding its extremum; again the computation takes a little
work, but it is not hard. The appearance of the factor $(1 - m^2)$ to appropriate
powers in the coefficients helps make the result particularly nice:


Lemma 4. For $m^2 \ne 1$, the function G(W) has a unique extremum (a minimum) at
the point

\[ W = \frac{1 - \cos^2 p - m^2\cos^2 p}{1 - m^2}. \tag{4} \]

The value at that point is $4m^2\cos^2 p\,(1 - \cos^2 p)$. □
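
Lemmas 3 and 4 can be spot-checked numerically. The sketch below is only an illustration in plain Python (the function names `Q`, `g`, `G`, and `G_min` are chosen here, and the sample values of m, p, and u are arbitrary); it compares the product $Q(p+u)Q(p-u)$ with the quadratic G evaluated at $W = \cos^2 u$ and evaluates the claimed minimum.

```python
import math

def Q(t, m):
    """Q(t) = m^2 + (1 - m^2) cos^2(t), as in equation (1)."""
    return m * m + (1 - m * m) * math.cos(t) ** 2

def g(u, p, m):
    """g(u) = Q(p + u) Q(p - u), as in equation (2)."""
    return Q(p + u, m) * Q(p - u, m)

def G(W, p, m):
    """The quadratic of Lemma 3 in the variable W = cos^2(u)."""
    n = 1 - m * m
    c = math.cos(p) ** 2
    return n * n * W * W + 2 * n * (m * m * c + c - 1) * W + n * n * c * c - 2 * n * c + 1

def G_min(p, m):
    """Location (4) and value of the minimum claimed in Lemma 4 (needs m^2 != 1)."""
    n = 1 - m * m
    c = math.cos(p) ** 2
    return (1 - c - m * m * c) / n, 4 * m * m * c * (1 - c)
```

For instance, with m = 1.7 and p = 0.9, `g(u, p, m)` and `G(math.cos(u)**2, p, m)` can be compared at several values of u, and `G(W, p, m)` can be evaluated at the first component of `G_min(p, m)`.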
Now we can begin to translate this result back into our original variables. We
had $g(u) = G(\cos^2 u)$, and hence we have

\[ g'(u) = -2\cos(u)\sin(u)\,G'(\cos^2 u). \]

Thus the critical points of g(u) occur at points where $\cos^2 u$ is either 0 or 1 or the
W in (4) where G has its minimum. When $\cos^2 u$ is either 0 or 1, we can see that
the last term in (3) is zero, and so we can evaluate $g(u) = Q(p + u)Q(p - u)$
directly; we get the values

\[ [m^2 + (1 - m^2)\cos^2 p]^2 \quad\text{and}\quad [m^2 + (1 - m^2)\sin^2 p]^2. \]

We have $p = (\pi/2 + \theta)/2$, and hence

\[ 2\cos^2 p - 1 = \cos(2p) = \cos(\pi/2 + \theta) = -\sin\theta. \]

Thus $\cos^2 p = (1 - \sin\theta)/2$, and similarly $\sin^2 p = (1 + \sin\theta)/2$. The values
at the first two types of critical points then come out to be

\[ \left[\frac{m^2 + 1 \pm (1 - m^2)\sin\theta}{2}\right]^2. \tag{5} \]


The value at the minimum of G is even simpler; it comes out to be just $m^2\cos^2\theta$.
Of course it is possible that the G-minimum occurs at a point that cannot be a
value of $\cos^2 u$. The reduction from p to $\theta$ shows that this minimum occurs when

\[ W = \frac{1 - m^2 + (1 + m^2)\sin\theta}{2(1 - m^2)}. \]

Hence we need to determine when this value is between 0 and 1. That is just a
two-line computation, yielding the condition

\[ \left|\frac{1 + m^2}{1 - m^2}\,\sin\theta\right| < 1. \tag{6} \]

Thus we have finished our computations, which we can summarize quite briefly:


Theorem 5. The function g(u) has the values (5) at the critical points where u is a
multiple of $\pi/2$. If (6) is false, these are the only critical points; but if (6) is true, g
also has an absolute minimum with value $m^2\cos^2\theta$. □


5. THE MAIN THEOREM. If we multiply by the denominator, we can state the 
basic inequality in a way that makes sense for singular matrices, and (like [KZ]) we 
include that in our final version of the result. 


Theorem 6. Let A be a nonzero 2 by 2 matrix, and let scaling and conjugation by
rotations reduce A to the form $MR(\theta)$, where $M = \mathrm{diag}(1, m)$. Let K denote the
smallest constant (if any) that makes $\|Ax\|^2 \le K \cdot \|x\| \cdot \|A^2x\|$ true for every vector
x. If

\[ \left|\frac{1 + m^2}{1 - m^2}\,\sin\theta\right| < 1, \]

then $K = 1/|\cos\theta|$. When m = 0 and $\cos\theta = 0$, the inequality does not hold with
any constant. In all other cases, K is the larger of the two values

\[ \frac{2|m|}{m^2 + 1 \pm (1 - m^2)\sin\theta}. \]


Proof: When $m \ne 0$, this theorem follows at once from (2), Lemma 2, and
Theorem 5. (The case $m^2 = 1$ was excluded in Section 4, but the theorem gives the
correct answer in that case, as (6) is then not true.) For m = 0, we can hold $\theta$ fixed
and let m approach 0; as we are working with the compact set of vectors of norm
1, the constant K in the limiting case will be the limit of the approximating ones. If
$\sin\theta \ne \pm 1$, then the expression in (6) is less than 1 for all m close to 0; the
extreme is thus equal to $1/|\cos\theta|$, as the theorem says. If $\sin\theta = \pm 1$, then we
are in the last case for all small m; the maximum there is $1/|m|$, which goes to
infinity as m goes to 0. □
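
As a sanity check on the statement of Theorem 6, one can compare its closed form with a brute-force maximization over the unit circle. The following sketch is an illustration only (plain Python; the names `K_formula` and `K_brute` and the grid size are assumptions made here). It takes A = MR(θ) with m nonzero, so that A is invertible and the excluded case m = 0, cos θ = 0 never arises.

```python
import math

def K_formula(m, theta):
    """K as stated in Theorem 6, for A = diag(1, m) R(theta) with m != 0."""
    s = math.sin(theta)
    if m * m != 1 and abs((1 + m * m) / (1 - m * m) * s) < 1:
        return 1 / abs(math.cos(theta))
    return max(2 * abs(m) / (m * m + 1 + sign * (1 - m * m) * s)
               for sign in (1, -1))

def K_brute(m, theta, n=40000):
    """Grid maximum of |Ax|^2 / (|x| |A^2 x|) over unit vectors x."""
    c, s = math.cos(theta), math.sin(theta)
    a11, a12, a21, a22 = c, -s, m * s, m * c          # A = M R(theta)
    best = 0.0
    for i in range(n):
        t = math.pi * i / n                           # the ratio has period pi
        x1, x2 = math.cos(t), math.sin(t)
        y1, y2 = a11 * x1 + a12 * x2, a21 * x1 + a22 * x2   # A x
        z1, z2 = a11 * y1 + a12 * y2, a21 * y1 + a22 * y2   # A^2 x
        best = max(best, (y1 * y1 + y2 * y2) / math.hypot(z1, z2))
    return best
```

Calling both functions with, say, m = 3 and θ = π/4 (a case where (6) holds) or m = 1.2 and θ = π/4 (a case where it fails) exercises the two branches of the theorem.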


The case where no bound exists corresponds to the nilpotent normalized matrix

\[ \begin{pmatrix} 0 & \pm 1 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}\begin{pmatrix} 0 & \pm 1 \\ \mp 1 & 0 \end{pmatrix}. \]

In the other cases, we know the u giving the maximum, and so we could trace back
the normalizations to compute the angle of maximum for the original A.


6. RELATION TO THE EARLIER TREATMENT. In [KZ], the matrices were 
normalized to have their diagonal entries both equal to 1 or both equal to 0. 
Consequently, the results there look rather different, and I want to conclude by 
displaying the connection with the formulas derived here. First we rotate: if we set 
$\psi = (\pi/2 - \theta)/2$, then it is easy to check that

\[ R(-\psi)\,MR(\theta)\,R(\psi) = \frac{m + 1}{2}\begin{pmatrix} \cos\theta & \dfrac{m-1}{m+1} - \sin\theta \\[2ex] \dfrac{m-1}{m+1} + \sin\theta & \cos\theta \end{pmatrix}. \]

To avoid special cases and sign distinctions, let us suppose that m > 1 and both
$\sin\theta$ and $\cos\theta$ are positive. We can then scale to get

\[ \begin{pmatrix} 1 & b \\ c & 1 \end{pmatrix} \]

with

\[ b = \frac{1}{\cos\theta}\left(\frac{m-1}{m+1} - \sin\theta\right) \quad\text{and}\quad c = \frac{1}{\cos\theta}\left(\frac{m-1}{m+1} + \sin\theta\right). \]

Using the notation of [KZ], we introduce $h = c - b$ and $r = (1 + h^2/4)^{1/2}$.
Their assertion is that K for this matrix is $|1 - bc|/(1 + b^2)$ except when b is
between $2(1 - r)/h$ and $2(1 + r)/h$, in which case K = r. It is easy to check that
(in our notation) $h = 2\tan\theta$ and $r = 1/\cos\theta$, while $|1 - bc|/(1 + b^2)$ comes
out to be

\[ \frac{2m}{m^2 + 1 + (1 - m^2)\sin\theta}. \]

Readers might find it a pleasant exercise to check that the betweenness condition
on b is equivalent to our condition (6).
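
The identities of this section can also be checked numerically before attempting the exercise. The sketch below is illustrative only: it assumes b and c are given by $b = \frac{1}{\cos\theta}\bigl(\frac{m-1}{m+1} - \sin\theta\bigr)$ and $c = \frac{1}{\cos\theta}\bigl(\frac{m-1}{m+1} + \sin\theta\bigr)$, with m > 1 and sin θ, cos θ positive, and the function names are chosen here rather than taken from [KZ].

```python
import math

def kz_quantities(m, theta):
    """Return (b, c, h, r) for the matrix normalized to unit diagonal."""
    q = (m - 1) / (m + 1)
    b = (q - math.sin(theta)) / math.cos(theta)
    c = (q + math.sin(theta)) / math.cos(theta)
    h = c - b
    r = math.sqrt(1 + h * h / 4)
    return b, c, h, r

def condition_6(m, theta):
    """Condition (6): |(1 + m^2)/(1 - m^2) * sin(theta)| < 1."""
    return abs((1 + m * m) / (1 - m * m) * math.sin(theta)) < 1

def b_is_between(m, theta):
    """The betweenness condition: 2(1 - r)/h < b < 2(1 + r)/h."""
    b, _, h, r = kz_quantities(m, theta)
    return 2 * (1 - r) / h < b < 2 * (1 + r) / h
```

For sample values such as (m, θ) = (3, π/4) or (1.2, π/4), `condition_6` and `b_is_between` can be compared directly, and h and r can be compared with 2 tan θ and 1/cos θ.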
Chaotic Motion of a Pendulum
with Oscillatory Forcing 


S. P. Hastings and J. B. McLeod 


I. INTRODUCTION. The mathematical theory of “chaos” has grown rapidly in
the last twenty years, with one landmark being the 1975 paper [9] of T. Y. Li and
J. Yorke which appeared in this journal. Indeed, we understand that this paper
included the first use of the word chaos in the context of dynamical systems. The 
subject dominates dynamical systems theory today, in the literature of both 
mathematics and physics. This is despite a lack of unanimity on what sorts of 
behavior should be called chaotic, or any firm definition of associated concepts, 
such as “strange attractor,” or “sensitivity to initial conditions.” 

The Li-Yorke paper, which turned out to be a rediscovery of some of the results 
of the Soviet mathematician A. N. Sharkovsky eleven years earlier [3, 13], dealt 
with iterations of maps of an interval into itself. Even today, chaos theory is far 
more developed in the case of maps, in one or two dimensions particularly, than it 
is for smooth dynamical systems, such as differential equations. This is largely 
because differential equations are so much harder. For example, the famous set of 
differential equations found by E. N. Lorenz [10] in the context of meteorological 
investigations is still not understood very well, since there are no proofs of chaotic 
behavior. 

There is, nevertheless, considerable theory for the case of ordinary differential 
equations (much less for partial equations) and several monographs are available 
expounding this theory. One of the best known is the book [6] by J. Guckenheimer 
and P. Holmes. There we learn again that the Soviets were ahead, since there is 
extensive discussion of the theories of V. K. Melnikov, from 1963 [11], and L. P. 
Shil’nikov, from 1968 [14]. The techniques of both of these pioneers were designed 
to reduce the study of certain systems of differential equations to the study of 
finite-dimensional maps. They show that imbedded in the phase space for these 
systems one can find a “horseshoe” map, which is a creation of S. Smale in the 
1960s [15], and which enables one to show, for example, that the system in question 
has infinitely many periodic solutions. Not all workers accept this as a criterion for 
chaos, but it is the focus of much work, and in most cases, all that has been proved 
for smooth systems of differential equations. 

The book by Guckenheimer and Holmes, like other literature in this field, is not 
easy reading. The theory was, and remains, incomplete and this is reflected in the 
unfinished nature of many of the results. For example, the following remark from
their discussion of the concept of strange attractor is probably still valid. 


“In trying to piece together a coherent picture of this situation, we enter a 
realm in which the theory remains in an unsatisfactory state. There are 
paradoxes in which different theorems appear to be steering us toward 
opposite conclusions.”... 
We do not propose to resolve such problems here, or even to give an exposition
of these fascinating concepts. Rather, we concentrate on the intuitive idea that a 
differential equation exhibits chaos if it has many solutions which are bounded and 
which are erratic and unpredictable in some sense. Within this limited scope there 
are a number of rigorous results, for smooth differential equations as well as for 
maps. 

Generally these have been obtained by methods, like those of Melnikov and 
Shil’nikov, which in some way reduce the study of the differential equation to the 
study of a related map in a lower dimensional space. For example, if we are 
studying an autonomous system of three first-order ordinary differential equations, 
the related map is two-dimensional, taking some subset of the plane into itself. 
The goal of this paper is to show that results of this sort are accessible by different, 
and we think simpler, methods, which involve study of solutions of the differential 
equation directly rather than through a related map. We do this for a particular 
example, the equation for a pendulum with oscillating support, for illustration. The 
same technique can be applied to other equations, and some examples are given in 
[7] and [8]. 


II. EQUATION OF MOTION FOR A PENDULUM. First consider a simple
pendulum, consisting of a mass m at the end of a massless rod of length l, which
pivots on a frictionless support that forces the pendulum to move in a vertical
plane (FIGURE 1).


Figure 1. We consider a pendulum free to rotate in a full circle. 


At time $\tau$ the rod makes an angle $u(\tau)$ with the vertical. The forces to be
considered are the gravitational force mg and the damping due to air resistance as
the pendulum swings. This is assumed to be proportional to the angular velocity
$du/d\tau = \dot u(\tau)$. Applying Newton’s law of motion, we obtain the ordinary
differential equation

\[ ml\ddot u + c\dot u + mg\sin u = 0, \]

where c is the positive constant of proportionality. We immediately rescale to
get the dimensionless version, by setting $t = \sqrt{g/l}\,\tau$ and $y(t) = u(\tau)$. Letting
$dy/dt = y'$, we get

\[ y'' + ky' + \sin y = 0, \tag{1} \]

where $k = c/(m\sqrt{gl})$.
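
For readers who want the rescaling spelled out, here is the short chain-rule computation, written under the assumption that the equation of motion reads $ml\ddot u + c\dot u + mg\sin u = 0$ and that $t = \sqrt{g/l}\,\tau$:

```latex
\begin{align*}
\dot u(\tau) &= \sqrt{g/l}\; y'(t), \qquad \ddot u(\tau) = \frac{g}{l}\, y''(t), \\
ml\cdot\frac{g}{l}\, y'' + c\sqrt{g/l}\; y' + mg\sin y &= 0
\quad\Longrightarrow\quad
y'' + \frac{c}{m\sqrt{gl}}\, y' + \sin y = 0 .
\end{align*}
```

Dividing by mg in the last step is what produces the single dimensionless damping constant k.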
The phase plane obtained by plotting y’(t) against y(t) is well known, and can 
be found in many introductory texts on ordinary differential equations, such as [2]. 
In FIGURE 2 we show the cases k = 0 and 0 < k < 2. 


(a) (b) 


Figure 2. Two phase planes for the pendulum, one undamped, the other damped. 


When k = 0 (no damping) the equation is called “conservative,” because there
is a function of the solution which is conserved, or, in other words, is constant as t
varies. This is the so-called “energy” function associated with the pendulum. If
$(y(\cdot), y'(\cdot))$ is a solution, the associated energy is

\[ E(t) = \frac{y'(t)^2}{2} - \cos y(t), \]

and is the usual energy function from physics, being a sum of the kinetic energy
$y'(t)^2/2$ and a potential energy term $-\cos y(t)$, which is at a minimum when the
pendulum is at its stable rest point, y = 0. To see that E is constant in the absence
of damping, differentiate the expression for E and use (1) with k = 0.

FIGURE 2 illustrates conservation of energy when k = 0 because there is a
family of orbits which are closed smooth curves, and represent periodic solutions.
These are solutions with low energy. On the other hand, if the initial velocity is
large, so that E(t) is large, then the trajectory in phase space is unbounded, and
represents a pendulum which continues to rotate in the same direction, making
repeated complete rotations without loss of energy. It is important to observe that
there are intermediate trajectories, such as the one which tends to the point
$(-\pi, 0)$ in phase space as $t \to -\infty$, and to $(\pi, 0)$ as $t \to \infty$.

A trajectory in the phase plane of an autonomous differential equation
represents many solutions, which can be characterized as passing through the same
point in phase space but at different times. Corresponding to the trajectory
connecting $(-\pi, 0)$ to $(\pi, 0)$ as t increases, which exists when k = 0, there is a
unique solution $y_0$ of (1) such that

\[ y_0(0) = 0, \qquad \lim_{t\to\infty} y_0(t) = \pi, \qquad \lim_{t\to-\infty} y_0(t) = -\pi. \tag{2} \]



Physically, this solution is approximated by a pendulum which starts from rest very 
close to the upright vertical position and makes almost a complete rotation, coming 
again close to the vertical position. 
Chaotic motion is not possible in this model, whatever the value of k. To obtain
more erratic solutions we must add some sort of forcing term. This can take
various forms, but the one we study here results from assuming that the support of
the pendulum is subject to a vertical motion, up and down, which is sinusoidal.
This adds a force proportional to $\sin\varepsilon t$ to the gravitational force. We make the
assumption that the force on the support varies slowly with time, and also that the
damping force is small. This results in the equation

\[ y'' + \delta y' + (1 + \gamma\sin\varepsilon t)\sin y = 0 \tag{3} \]

where $\delta$ and $\gamma$ are fixed positive numbers and $\varepsilon$ is positive but small. To avoid the
delicate case where the coefficient of sin y can be zero, we require that $0 < |\gamma| < 1$.
Equation (3) was studied by S. Wiggins, in [16].


III. PREVIOUS RESULTS. To describe Wiggins’ result, suppose that y > 0
represents displacement from rest in a counter-clockwise direction, and a full
rotation occurs each time y(t) crosses an odd multiple of $\pi$. He then shows that
there is a $\Delta(\gamma) > 0$ such that the irregular behavior can occur if $0 < \delta < \Delta(\gamma)$ and
$\varepsilon$ is sufficiently small. We shall describe the function $\Delta(\cdot)$ shortly, but first let us
specify the nature of this “irregular” behavior. Our measure of irregularity is that
the pendulum makes a sequence of full rotations, alternating between clockwise
and counter-clockwise rotations in an erratic manner. More precisely (without,
however, yet specifying $\Delta(\gamma)$), we state this as a theorem.


Theorem 1 (Wiggins [16]). Suppose that $\delta$ and $\gamma$ are given numbers, with $0 < |\gamma| < 1$
and $0 < \delta < \Delta(\gamma)$. Then there is an $\varepsilon_0 > 0$ such that for any $\varepsilon$ with $0 < \varepsilon < \varepsilon_0$,
and any finite or infinite sequence $\{m_i\}_{i=1,2,3,\ldots}$ of positive integers, there is a
solution of (3) such that the corresponding motion consists of exactly $m_1$ full
clockwise rotations, followed by exactly $m_2$ full counter-clockwise rotations, $m_3$ full
clockwise rotations, and so forth. If the sequence is finite, then eventually the
pendulum stops making full rotations.


It is common to refer to the solutions corresponding to infinite non-repeating 
sequences as “chaotic,” though, as we said, some researchers prefer a stricter 
interpretation of this term. 

Wiggins obtains this striking result by applying the technique of Melnikov to the 
equation (3). This requires extending Melnikov’s original method, because previ- 
ously the forcing term was required to be “small” in amplitude, whereas here the 
small parameter measures the frequency of the oscillation, not its amplitude. The 
necessary extension was also given by Palmer [12]. 

To define the function $\Delta(\cdot)$, Wiggins derives the appropriate “Melnikov”
function for (3). While this concept has always seemed slightly mysterious to us,
here it results from very standard energy methods involving the function E(t). (See
below.) Suppose that $y_0$ is the unique solution to (1) with k = 0 satisfying (2).
Then

\[ \Delta(\gamma) = |\gamma|\,\frac{\int_{-\infty}^{\infty} s\,y_0'(s)\sin y_0(s)\,ds}{\int_{-\infty}^{\infty} y_0'(s)^2\,ds}. \]

In other words, chaotic solutions exist for sufficiently small $\varepsilon > 0$ if

\[ \int_{-\infty}^{\infty} I(s)\,ds > 0, \tag{4} \]

where

\[ I(s) = -\delta\,y_0'(s)^2 + |\gamma|\,s\,y_0'(s)\sin y_0(s). \tag{5} \]

The left side of (4) is the Melnikov function for the equation (3). Since $y_0$ satisfies
(1) with k = 0, we can set $\sin y_0(s) = -y_0''(s)$ in the formula for $\Delta(\gamma)$, integrate by
parts, and use the boundary conditions to obtain that $\Delta(\gamma) = \tfrac{1}{2}|\gamma|$.
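
The value ½|γ| can be confirmed by direct quadrature. The sketch below is an illustration, not part of the original argument: it uses the closed form $y_0(s) = 4\arctan(e^s) - \pi$ for the connecting solution of $y'' + \sin y = 0$ (a standard formula that is assumed here, since the text does not state it), for which $y_0'(s) = 2/\cosh s$. The truncation length and step count are arbitrary choices.

```python
import math

def y0(s):
    """Connecting solution of y'' + sin y = 0 with y0(0) = 0 and limits -pi, pi."""
    return 4 * math.atan(math.exp(s)) - math.pi

def y0_prime(s):
    """Derivative of y0, equal to 2 sech(s)."""
    return 2 / math.cosh(s)

def trapezoid(f, a, b, n):
    """Composite trapezoid rule with n subintervals on [a, b]."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

def delta(gamma, L=30.0, n=20000):
    """Approximate |gamma| * int s y0' sin(y0) ds / int (y0')^2 ds on [-L, L]."""
    num = trapezoid(lambda s: s * y0_prime(s) * math.sin(y0(s)), -L, L, n)
    den = trapezoid(lambda s: y0_prime(s) ** 2, -L, L, n)
    return abs(gamma) * num / den
```

Evaluating `delta(0.5)` and comparing it with 0.25 reproduces the integration-by-parts result numerically.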


IV. PROOF WHEN $\delta = 0$. We will show how these results, and others, can be
obtained by techniques which we feel are simpler than those used previously.
Instead of studying Poincaré maps, we follow the solutions more directly, to
determine how they vary as the initial conditions change. We need consider only
initial conditions representing a pendulum which is released from a raised
position, with zero initial velocity. The case $\delta = 0$ in (3) is particularly simple.
Therefore we consider solutions to

\[ y'' + (1 + \gamma\sin\varepsilon t)\sin y = 0, \tag{6} \]

with initial conditions

\[ y(0) = \alpha, \qquad y'(0) = 0. \tag{7} \]

Sometimes we will denote the solution by $y_\alpha$. The goal is to obtain complicated
solutions by adjusting $\alpha$. This is sometimes called a “shooting method,” because
we attempt to “aim” the solution to get the desired behavior.

Shooting methods are topological, relying on separation theorems of some sort
to distinguish between various types of behavior. As an example we prove a simple
result about (6)-(7) which we will need later.


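Lemma 1. For sufficiently small $\varepsilon > 0$ there is an $\hat\alpha \in (-\pi, 0)$ such that if $y = y_{\hat\alpha}$,
then $y' > 0$ on $(0, \pi/2\varepsilon]$ and $y(\pi/2\varepsilon) = 0$. There is also an $\tilde\alpha$ such that $y' > 0$ on
$(0, \pi/\varepsilon]$ and $y(\pi/\varepsilon) = 0$.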


Proof: We show how to get $\hat\alpha$; the argument for $\tilde\alpha$ is the same. If $-\pi < y < 0$,
then $y'' > 0$, and so by choosing $\alpha$ in this range we ensure that $y' > 0$ as long as
y < 0. Clearly, y crosses 0. Let

\[ A = \left\{\alpha \in (-\pi, 0) \;\middle|\; y(t) = 0 \text{ before } t = \frac{\pi}{2\varepsilon}\right\} \]

and

\[ B = \left\{\alpha \in (-\pi, 0) \;\middle|\; y(t) = 0 \text{ after } t = \frac{\pi}{2\varepsilon}\right\}. \]

Note that if $y(0) = -\pi$, $y'(0) = 0$, then y is constant, so that if $\alpha$ is very close to
$-\pi$, then y remains close to $-\pi$ for a long time before crossing 0. This shows that
B is non-empty. To show that A is non-empty, consider a small negative $\alpha$. As
long as y is small, solutions of (6) are approximated by solutions of the linear
equation $u'' + (1 + \gamma\sin\varepsilon t)u = 0$. Solutions of this equation oscillate more quickly
than solutions of $v'' + \sigma v = 0$ where $\sigma = 1 - |\gamma|$, and so cross zero in the interval
$(0, \pi/\sqrt{\sigma})$. Hence if $2\varepsilon < \sqrt{\sigma}$, then small negative $\alpha$'s lie in A.

The crossings of zero are with $y' > 0$, and so the continuity of solutions with
respect to $\alpha$ implies that A and B are open sets. They are obviously disjoint, and
the connectedness of the interval $(-\pi, 0)$ implies that there is a point in this
interval which is not in A or B. Such an $\alpha$ gives the solution $y_\alpha$ described in the
lemma.
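
The separation argument translates directly into a bisection on the initial angle. The following sketch is only an illustration under stated assumptions: the parameter values γ = 0.5 and ε = 0.1 are chosen here (they satisfy $2\varepsilon < \sqrt{1 - |\gamma|}$), the integrator is a hand-written classical Runge-Kutta scheme, and the function names are hypothetical. The bisection keeps one endpoint whose solution has not reached 0 by time π/2ε and one endpoint whose solution has.

```python
import math

def accel(t, y, gamma, eps):
    """Right-hand side of (6): y'' = -(1 + gamma sin(eps t)) sin y."""
    return -(1 + gamma * math.sin(eps * t)) * math.sin(y)

def rk4_step(t, y, v, h, gamma, eps):
    """One classical Runge-Kutta step for the system y' = v, v' = accel."""
    k1y, k1v = v, accel(t, y, gamma, eps)
    k2y, k2v = v + h / 2 * k1v, accel(t + h / 2, y + h / 2 * k1y, gamma, eps)
    k3y, k3v = v + h / 2 * k2v, accel(t + h / 2, y + h / 2 * k2y, gamma, eps)
    k4y, k4v = v + h * k3v, accel(t + h, y + h * k3y, gamma, eps)
    return (y + h / 6 * (k1y + 2 * k2y + 2 * k3y + k4y),
            v + h / 6 * (k1v + 2 * k2v + 2 * k3v + k4v))

def crosses_by(alpha, T, gamma, eps, steps=4000):
    """Integrate from y(0) = alpha, y'(0) = 0; report whether y reaches 0 by time T.

    Returns (crossed, last_y), where last_y is y at the crossing step or at T.
    """
    h = T / steps
    y, v = alpha, 0.0
    for i in range(steps):
        y, v = rk4_step(i * h, y, v, h, gamma, eps)
        if y >= 0:
            return True, y
    return False, y

def shoot(T, gamma, eps, iters=60):
    """Bisect between a slow start (near -pi) and a fast start (near 0)."""
    lo, hi = -math.pi + 1e-12, -0.1      # lo has not crossed by T, hi has
    for _ in range(iters):
        mid = (lo + hi) / 2
        if crosses_by(mid, T, gamma, eps)[0]:
            hi = mid
        else:
            lo = mid
    return lo
```

A call such as `shoot(math.pi / (2 * 0.1), 0.5, 0.1)` returns an approximate initial angle whose solution arrives at y = 0 at time π/2ε, up to the accuracy of the integrator.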
Note that we make no assertion about whether $\hat\alpha > \tilde\alpha$ or vice versa, nor will we
need any such result. Comparisons of this kind are generally difficult to obtain.
They are used in uniqueness proofs, but this is a paper about existence.

A crucial fact about solutions of (3) is that if $\alpha = k\pi$ for some integer k, then
the solution is constant. In fact a stronger statement is true:

(i) If, for some $t_0$, $y(t_0) = k\pi$ and $y'(t_0) = 0$, then $y(t) = k\pi$ for all t.

This follows from the uniqueness theorem for initial value problems for
ordinary differential equations. The solution $y = k\pi$ is the unique solution satisfying
the conditions $y = k\pi$, $y' = 0$ at the point $t = t_0$.

One reason that the case $\delta = 0$ is particularly simple is that in this case some
solutions have certain symmetries, around points $T_n/\varepsilon$, where n is an integer and
$T_n = (2n + 1)\pi/2$. These are as follows.

(ii) If $y(T_n/\varepsilon) = k\pi$ for some integer k, then

\[ y\!\left(\frac{T_n}{\varepsilon} + s\right) - k\pi = k\pi - y\!\left(\frac{T_n}{\varepsilon} - s\right) \]

for all s.

(iii) If $y'(T_n/\varepsilon) = 0$, then

\[ y\!\left(\frac{T_n}{\varepsilon} + s\right) = y\!\left(\frac{T_n}{\varepsilon} - s\right) \]

for all s.

These are also proved by using the uniqueness of solutions to initial value
problems. Note that the solution $y_{\hat\alpha}$ found in Lemma 1 is antisymmetric around
$\pi/2\varepsilon$, and therefore it increases up to $\pi/\varepsilon$, where it has a maximum in the region
$0 < y < \pi$ and then starts to decrease.

The only detailed analysis required to prove Theorem 1 when $\delta = 0$ is used to
obtain the following lemma.


Lemma 2. For sufficiently small $\varepsilon > 0$, there is some $\bar\alpha$ with $-\pi < \bar\alpha < 0$, such
that $y_{\bar\alpha}$ increases monotonically on some interval $[0, t_0]$ and $y_{\bar\alpha}(t_0) = \pi$.


The proof of Lemma 2 is quite simple in the situation we are considering. We 
give an outline below. Here we want to show how it is used in conjunction with a 
shooting technique to prove Theorem 1. 


Figure 3. A graph of a solution behaving as described in Lemma 2. 
Suppose that the first complete rotation of the pendulum is to be in the
direction of positive y; i.e. we want the solution to cross $y = \pi$ before any possible
crossing of $y = -\pi$. We begin by choosing $\alpha = \bar\alpha$ as in Lemma 2. The
corresponding solution crosses $y = \pi$ at a point $t_0$, with $y' > 0$, and since $y'' > 0$ when
$\pi < y < 2\pi$, there must be a $t_1 > t_0$ such that $y' > 0$ on $(0, t_1]$, and $y(t_1) = 2\pi$.
Since $y'(t_1) > 0$, the implicit function theorem implies that there is a smooth
function $t_1(\alpha)$, defined in a neighborhood of $\bar\alpha$ by the equation $y_\alpha(t_1(\alpha)) = 2\pi$.

However, recalling the properties of y in Lemma 1 when $\alpha = \hat\alpha$, we see that
$t_1(\hat\alpha)$ is not defined. Suppose for convenience that $\hat\alpha > \bar\alpha$. Then the function $t_1(\cdot)$
is continuous in some maximal interval of the form $[\bar\alpha, \alpha^*)$, where $\bar\alpha < \alpha^* < 0$. Since
$t_1(\cdot)$ can be extended continuously to an open neighborhood of any point where it
is defined, we conclude that

\[ \lim_{\alpha \to \alpha^* -} t_1(\alpha) = \infty. \tag{8} \]

This result depends on property (i) above, for otherwise a crossing of $2\pi$ might
disappear by the solution becoming tangent to this line for some $\alpha$.

To illustrate the idea of the proof, suppose that $m_1 = 2$, so that we want y to
cross $3\pi$ before recrossing $\pi$. This is accomplished by moving $\alpha$ from $\bar\alpha$ towards
$\alpha^*$. Since $t_1(\cdot)$ is continuous, it follows from (8) that there is an $\alpha_1 \in (\bar\alpha, \alpha^*)$ such
that $t_1(\alpha_1) = T_{n_1}/\varepsilon$, for some integer $n_1$. We now apply the anti-symmetry
principle (ii). Since $y'(0) = 0$, $-\pi < y(0) < 0$, and y increases monotonically to reach
$2\pi$ at $T_{n_1}/\varepsilon$, it must continue to increase monotonically until it crosses $3\pi$ and $4\pi$,
after which it has a maximum before any possible crossing of $5\pi$. We have then
accomplished the first step of achieving exactly two counter-clockwise rotations,
and we next wish to obtain a clockwise rotation, since we are assuming that
$m_2 \ge 1$.

Let $t_2(\alpha_1)$ be the first t > 0 where $y_{\alpha_1}$ has a local maximum. As we have noted,
$4\pi < y_{\alpha_1}(t_2(\alpha_1)) < 5\pi$. Then $t_2(\cdot)$ can be extended as a continuous function in
some neighborhood of $\alpha_1$, as a solution of the equation $y'(t) = 0$. As long as $t_2(\cdot)$
is continuous, $y_\alpha(t_2(\alpha))$ must lie in the interval $(4\pi, 5\pi)$, since (i) implies that the
maximum cannot leave this interval by means of a tangency. Moreover, $t_2(\cdot)$ can
be extended continuously to a maximal interval of the form $[\alpha_1, \alpha_1^*)$, where
$\alpha_1 < \alpha_1^* \le \alpha^*$ and

\[ \lim_{\alpha \to \alpha_1^* -} t_2(\alpha) = \infty. \]

Therefore, we can find $\alpha_2 \in (\alpha_1, \alpha_1^*)$ such that $t_2(\alpha_2) = T_{n_2}/\varepsilon$, for some integer
$n_2$. It is important to note that $t_1(\alpha_2)$ is not of the form $T_n/\varepsilon$, but it is still defined,
as the first point where $y_{\alpha_2}$ crosses $2\pi$. By (iii), $y_{\alpha_2}$ is symmetric around $t_2(\alpha_2)$,
and so it must descend from its maximum there to recross $3\pi$ and $\pi$. If $m_2 = 2$
then this completes the second step of the induction process, since with the choice
we have made of $\alpha_2$ the next extremum of $y_{\alpha_2}$ is a minimum at $t = 2t_2(\alpha_2)$,
where $y = y(0) \in (-\pi, 0)$. If $m_2 \ne 2$, we must adjust $\alpha$ further. As we adjust $\alpha$,
the various crossing points defined so far will change. However at each adjustment
we only move $\alpha$ enough to bring the most recently defined crossing point or critical
point to one of the points of symmetry or antisymmetry on the t axis. All the
earlier such points remain bounded, and hence continuous in $\alpha$, and none of the
critical points can cross any line $y = k\pi$.

The case $m_2 = 5$ is typical. So far we have a solution which increases from its
starting point in the interval $(-\pi, 0)$ past $\pi$ and $3\pi$, and then decreases back past
0. We can increase $\alpha$ still further so that this (downward) crossing of 0 is at one of
the points $T_n/\varepsilon$. Then the antisymmetry property (ii) shows that the solution must
continue to decrease past $-\pi$ and $-3\pi$, but not past $-5\pi$.

This means the pendulum makes four clockwise rotations, and we want
exactly five. This is achieved by a further increase of $\alpha$ so that the point of
antisymmetry is where the solution crosses $-\pi$. Since this is half-way between $3\pi$
and $-5\pi$, and the solution has a maximum in the earlier interval where it was
between $4\pi$ and $5\pi$, it will now have a minimum between $-6\pi$ and $-7\pi$. It
crosses $-5\pi$, but not $-7\pi$, which completes the second step.

Continuing this process, we obtain an increasing sequence of $\alpha$’s which is
bounded above by $\alpha^*$. This sequence must have a limit lying in the interval $(-\pi, 0)$,
and this limiting value of $\alpha$ gives the solution we were after. Note that at each
step, when we adjust $\alpha$ so that some crossing or extremal point lies at some odd
multiple of $\pi/2\varepsilon$, there are infinitely many choices of $\alpha$, since there are infinitely
many such odd multiples. This gives, for each sequence $\{m_i\}$, an infinite number of
solutions of the desired type.


V. CHAOS AT PARTICULAR PARAMETER VALUES. It is apparent from the
proof of Theorem 1 that, for $\delta = 0$, chaotic solutions exist if there is a solution
such that $-\pi < y(0) < 0$, $y'(0) = 0$, and $y(T) = \pi$ for some T > 0. For particular
parameter values this can be verified by following only one trajectory for a finite
time interval. Therefore, in principle, rigorous estimates can be made which allow
a proof that chaotic solutions exist for precise values of the parameters, rather
than “for sufficiently small $\varepsilon$,” as in Theorem 1. First we have to locate a good
candidate for the parameter values. Standard numerical experimentation can easily
be done on a personal computer, for example with the software Phsplan [4]. It is
quickly determined that for $\varepsilon = 0.1$, $\gamma = -0.5$, the solution with $y(0) = -2.7$,
$y'(0) = 0$ increases monotonically until it crosses $\pi$, at approximately t = 6.1.
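
This kind of numerical experiment is easy to repeat. The sketch below is a non-rigorous illustration in ordinary floating point (it is neither Phsplan nor interval arithmetic); the function name, step size, and time horizon are choices made here. It integrates (6) with a classical Runge-Kutta scheme from y(0) = −2.7, y'(0) = 0 with ε = 0.1 and γ = −0.5, and reports the first time y reaches π together with a flag saying whether y' stayed positive up to that time.

```python
import math

def first_crossing_of_pi(alpha=-2.7, gamma=-0.5, eps=0.1, h=1e-3, t_max=20.0):
    """Integrate y'' + (1 + gamma sin(eps t)) sin y = 0 with y(0)=alpha, y'(0)=0.

    Returns (t_cross, monotone): an interpolated time of the first crossing of
    y = pi (None if it is not reached before t_max) and whether y' > 0 held at
    every step up to that point.
    """
    def acc(t, y):
        return -(1 + gamma * math.sin(eps * t)) * math.sin(y)

    y, v = alpha, 0.0
    monotone = True
    for i in range(int(t_max / h)):
        t = i * h
        # Classical fourth-order Runge-Kutta step for (y, v).
        k1y, k1v = v, acc(t, y)
        k2y, k2v = v + h / 2 * k1v, acc(t + h / 2, y + h / 2 * k1y)
        k3y, k3v = v + h / 2 * k2v, acc(t + h / 2, y + h / 2 * k2y)
        k4y, k4v = v + h * k3v, acc(t + h, y + h * k3y)
        y_new = y + h / 6 * (k1y + 2 * k2y + 2 * k3y + k4y)
        v_new = v + h / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)
        if v_new <= 0:
            monotone = False
        if y_new >= math.pi:
            frac = (math.pi - y) / (y_new - y)   # linear interpolation in the step
            return t + frac * h, monotone
        y, v = y_new, v_new
    return None, monotone
```

Such a floating-point run only suggests where to look; it carries no error bounds of its own.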

We then make use of a technique in numerical analysis called “interval 
arithmetic.” In interval arithmetic, the computer is programmed to include error 
estimates in all arithmetic computations. An exposition is given in [1]. In addition 
to roundoff error it is necessary to allow for the truncation error introduced by the 
numerical method. Programs can be written to do this. We used PBASIC [1], 
which uses precise interval arithmetic, to do a completely rigorous integration of 
the equation with the parameter values and initial conditions found approximately 
using standard floating point computations. The result confirmed that the solution 
does indeed cross $\pi$, and therefore that with these parameter values there are
solutions of arbitrary complexity, as described in the theorem. 


VI. EXTENSIONS. The outline above of the proof of Theorem 1 makes it appear 
that symmetry is crucial for this result. But this is misleading. Symmetry considera- 
tions shorten the proof, but the heart of our method is the topological shooting 
principle. This is fortunate, for as soon as we add a damping term, as in (3), 
properties (ii) and (iii) do not hold. Shooting works because (i) is still valid. We 
need a slight extension of Lemma 2, but this is not difficult. 

Physical intuition may cause doubts about this, because damping reduces energy 
and tends to stop the complete rotations. However the oscillation of the support 
can add energy if it is timed correctly. It is rather like pushing a child’s swing. If 
the pushes occur in the direction of motion, the swing will go higher, despite air 
resistance. If the resisting forces are not too high we can indeed push a swing over
the top (though perhaps the child will be better off if we do not). Do not carry this
analogy too far, however. A very important difference is that in our case the
motion of the support is determined ahead of time, and not adjusted to fit the
motion of the pendulum.

In fact, even the periodicity of the forcing is not required. We can consider 
more general equations 

    y'' + εδy' + p(εt) sin y = 0    (9)

where the positive smooth function p increases and decreases in some fashion. For 
example, here is a set of sufficient conditions on p. 

(a) p, 1/p, p' and p'' exist and are bounded on [0, ∞). 

(b) There are sequences {t_n} and {τ_n} tending to infinity and a c > 0 such that 
p'(t_n) ≥ c and p'(τ_n) ≤ −c. 

This includes almost periodic functions and many others, and goes beyond what 
has been found using Poincaré maps, for it is difficult to define such a map usefully 
when the equation depends explicitly on time in an irregular fashion. 

The damping term εδy' in (9) is responsible for the introduction of the 
Melnikov function as described earlier. This enters into the proof of Lemma 2 
when δ > 0, but plays no role in the shooting part of the argument. We conclude 
this paper with a brief discussion of how to prove Lemma 2. 


VII. OUTLINE OF PROOF OF LEMMA 2. The basic idea is to consider the 
energy E(t), as defined earlier. The damping term tends to reduce E (makes 
E' < 0 when y' ≠ 0), while the oscillation in the support may increase or decrease 
E, depending on the relative direction of movement of the support and the 
pendulum at any given time. Suppose again that δ = 0 and p(s) = 1 + γ sin s. We 
find using (6) that E'(t) = −γ y'(t) sin εt sin y(t). To prove Lemma 2, we show 
that the initial position α can be adjusted so that E(t) increases enough during the 
first swing to get the pendulum “over the top” for one complete revolution. 

Suppose that γ > 0. Then we can choose α to be the value which was found in Lemma 1. 
In this case we have sin εt > 0 and sin y < 0 in the interval (0, π/ε), and both of 
these quantities change sign at t = π/ε. It follows that E increases on the entire 
interval (0, 2π/ε), as long as y' > 0, y < π. 

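We can estimate the change in E while y' > 0, y < π as follows. If π/ε < t < 
2π/ε, then 

\[
\begin{aligned}
E(t) &= E(0) + \int_{0}^{t} \{-\gamma\, y'(s) \sin \varepsilon s \, \sin y(s)\}\, ds \\
     &> E(0) + \int_{\pi/\varepsilon - 1}^{\pi/\varepsilon} \{-\gamma\, y'(s) \sin \varepsilon s \, \sin y(s)\}\, ds. 
\end{aligned}
\tag{10}
\]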


The second term on the right is estimated by proving a simple lemma showing that 
as ε → 0, the solution y found in Lemma 1 must tend to y_0(t − π/ε), where y_0 is 
the unique solution of (1)-(2) with k = 0. That is, for example, 

\[
\max_{\pi/\varepsilon - 1 \le t \le \pi/\varepsilon} \left| y(t) - y_0\!\left(t - \frac{\pi}{\varepsilon}\right) \right| \to 0 .
\]

Setting s − π/ε = σ, and noting that sin εσ ∼ εσ uniformly on compact 
σ-intervals, we find that the second term on the right of (10) is asymptotically like 
∫_{−1}^{0} γεσ y_0'(σ) sin y_0(σ) dσ = με, where μ is a positive number. We also need an 
easy estimate for E(0) = −cos α. Without going into further details, this enables 
us to obtain quickly an estimate of the form y'(t) ≥ k√ε for t ≥ π/ε, implying 
that y rises above π before t = 2π/ε. 

When δ is positive, the rate of change of energy is given by E'(t) = −δε y'² − 
γ y' sin εt sin y. The first term causes E to decrease, while the sign of the second 
term alternates. The analysis of the net change of E in the same situation as 
above, where y = 0 when t = π/ε, is a little more complicated and leads to a 
consideration of the Melnikov function. The case of a non-periodic forcing is no 
harder; periodicity is irrelevant to Lemma 2. 


VIII. FINAL REMARKS. Theorem 1 and the extensions described in the last 
section by no means tell the whole story, even for (6). While we show that there is 
an uncountable number of erratic solutions, our numerical simulations indicate 
that most solutions eventually settle down into small oscillations around an even 
multiple of π, which represents the downward vertical position of the pendulum. 
We do not know of a proof of this, however. 

Also, we do not know whether the results can be extended to include larger 
values of ε. Our standard numerical computations (not rigorous) show that the 
crucial solution of Lemma 2 exists at least out to ε = 50. For very large ε it seems 
that another subtle effect enters in, the so-called “exponential splitting of 
separatrices” [5]. This is certainly beyond the scope of this paper. What we would like to 
show is that the phenomenon occurs for all values of ε, and this seems to require 
some new estimates. 
An Application for the Curiosity (log_x N)' 


David A. Wagstaff, Theodore A. Norman, 
and Douglas M. Campbell 


Students often believe that differentiation formulas such as 
(*)    [log_x N]' = −(log_x N)/(x ln x) 
are mere curiosities. We present a practical application of (*). 

In practice, unsorted data files on a hard disk may be extremely large (e.g. 
40 megabytes), while available RAM (Random Access Memory) on many personal 
computers is small (e.g. 1 megabyte). There is a simple strategy to sort such a file: 


1. Divide it into chunks which are the size of RAM (for our example, 1 megabyte 
chunks). For each chunk, read the chunk into RAM, sort it by one’s favorite 
internal sort, and write the chunk back to the hard disk as a separate file (see [1], 
pp. 263-270; [2], chapter 5.4, Theorem L, page 371). 


2. Then, as FIGURE 1 indicates, groups of x of these sorted chunks are merged 
into a larger sorted chunk, and written back to the hard disk. This continues until 
the final merge, in which only x huge chunks remain and they are merged into the 
final sorted file. 


Although this example is a considerable simplification of the real problem, the 
most time consuming part of this operation is the seek, in which the access 
mechanism of the hard disk is moved to the proper track on the hard disk to read 
the information. The question arises, what value of x will minimize the number of 
seeks? Furthermore, how does the value of x depend on the file size and the 
available RAM? 


Theorem. Let a file have N records and let the computer’s RAM hold M records. 
The value of x to minimize the number of seeks in an x-way merge is 3, independent 
of N and M. 


Proof: Sub-divide the N records into R = N/M files (chunks), which are read into 
memory, internally sorted, and written to the hard disk to form the top row of 
Figure 1. (By adding specially marked dummy records, we may assume that R is a 
power of x.) The system of x-way merges of Figure 1 has log_x R levels. At each 
level, the contents of the original file have to be read into x buffers. Since RAM 
can only hold M records, each of the x buffers holds M/x records. Each time a 
buffer is filled, a seek is required. Therefore, the number of seeks per level is 
N/(M/x), which is xR. The total number of seeks, y, is the number of seeks per 
level times the number of levels: 
    y = xR log_x R. 


Taking the derivative with respect to x we see that y' = R log_x R [−1 + 
ln x]/ln x, which is zero when x = e. Since y' goes from negative to positive at 
x = e, the function y has its minimum at x = e, independent of N and M. But the 
value of x in an x-way merge must be an integer. Writing R as 2^z, for some z > 0, 
we note that 

    y(2) − y(3) = 2R log_2 R − 3R log_3 R 
                = R(2z − 3z log_3 2) 
                = Rz(log_3 9 − log_3 8) > 0, 


which confirms that a 3-way merge minimizes the number of seeks. 
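
The comparison among integer merge orders can also be checked numerically. The following is a minimal sketch of the seek count y = xR log_x R from the proof; the function names and the candidate range 2 to 16 are assumptions made for illustration.

```python
import math

def seeks(x, R):
    """Seek count y = x * R * log_x(R) for an x-way merge of R sorted chunks."""
    return x * R * math.log(R) / math.log(x)

def best_merge_order(R, candidates=range(2, 17)):
    """Integer x among the candidates that minimizes the seek count."""
    return min(candidates, key=lambda x: seeks(x, R))

# For the 40 megabyte file and 1 megabyte RAM of the example, R = 40.
print(best_merge_order(40))  # 3
```

Since y/(R ln R) = x/ln x does not involve R, the same answer comes out for every R > 1; note also that x = 2 and x = 4 give equal counts.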


To derive (*), we note that y = log_x N can be rewritten as x^y = N and thereby 
as y ln x = ln N. Taking the derivative yields y' ln x + y/x = 0, which when 
solved for y' yields (*). 
Vandermonde Strikes Again 


Miriam Schapiro Grosof and Geraldine Taiani 


The search for “cute” proofs—those in which arguments or techniques that at first 
glance appear completely unrelated in substance are used to establish familiar 
results—provides healthful exercise for both students and experts. One of the 
more versatile tools for this purpose is the Vandermonde matrix and its determi- 
nant. This mathematical object has an honorable history dating from the late 18th 
century [1] and has enjoyed sporadic revivals of interest [3], [5]. In the past, most 
undergraduate major courses included its applications in the proof that values at 
n + 1 distinct points uniquely determine a monic polynomial of degree n [7] and in 
the definition of the signature of a permutation [4]. We present here a novel, 
indeed unexpected, application of the Vandermonde. 

A theorem of Abel [2] states: if P(x), Q(x) are any two polynomials such that 
deg Q = n ≥ 3, Q has no multiple roots, and deg P = m ≤ n − 2, then 

\[
\sum_{i=1}^{n} \frac{P(r_i)}{Q'(r_i)} = 0, \tag{A}
\]

where the summation is over all n distinct roots r_i of Q. (As usual, Q' denotes the 
derivative of Q.) 


Abel’s original proof (of a more complicated result) uses integrals. The modern 
standard proof is based on residue theory. Since deg P ≤ deg Q − 2, 

\[
\left| \int_{|z|=R} \frac{P(z)}{Q(z)}\,dz \right| \le \frac{C}{R}
\quad \text{for some constant } C > 0 \text{ and } R \text{ sufficiently large;}
\]
hence 
\[
\lim_{R\to\infty} \int_{|z|=R} \frac{P(z)}{Q(z)}\,dz = 0.
\]
However, 
\[
\int_{|z|=R} \frac{P(z)}{Q(z)}\,dz = 2\pi i \cdot \Bigl\{ \text{sum of residues of } \frac{P(z)}{Q(z)} \text{ inside } |z| < R \Bigr\},
\]
and since P(z)/Q(z) has simple poles at the roots of Q the residues of P(z)/Q(z) 
inside |z| = R are precisely 
\[
\Bigl\{ \frac{P(r_i)}{Q'(r_i)} : r_i \text{ roots of } Q \text{ in } |z| < R \Bigr\}.
\]
The desired (A) follows. 


We have found an algebraic proof of (A) in the spirit of classical theory of 
equations. Strictly speaking it requires no complex analysis. Given polynomial 
Q with deg Q = n ≥ 3 and distinct (real or complex) roots r_1, r_2, ..., r_n, 
assume w.l.o.g. Q is monic so Q(x) = (x − r_1)(x − r_2)···(x − r_n). Then 
\[
Q'(x) = \sum_{i=1}^{n} (x - r_1)(x - r_2)\cdots\widehat{(x - r_i)}\cdots(x - r_n),
\]
that is, each summand has one factor, (x − r_i), omitted. Thus, 
\[
\begin{aligned}
Q'(r_i) &= (r_i - r_1)(r_i - r_2)\cdots(r_i - r_{i-1})(r_i - r_{i+1})\cdots(r_i - r_n) \\
        &= (-1)^{n-i}(r_i - r_1)(r_i - r_2)\cdots(r_i - r_{i-1})(r_{i+1} - r_i)\cdots(r_n - r_i) \\
        &= (-1)^{n-i}\, \frac{\prod_{j>k}(r_j - r_k)}{\prod_{j>k;\ j,k\ne i}(r_j - r_k)} .
\end{aligned}
\]


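Recall now that the Vandermonde determinant [6] 

\[
V_n(a_1,\dots,a_n) = \det\begin{pmatrix}
1 & 1 & \cdots & 1 \\
a_1 & a_2 & \cdots & a_n \\
a_1^2 & a_2^2 & \cdots & a_n^2 \\
\vdots & \vdots & & \vdots \\
a_1^{n-2} & a_2^{n-2} & \cdots & a_n^{n-2} \\
a_1^{n-1} & a_2^{n-1} & \cdots & a_n^{n-1}
\end{pmatrix}
\]

is a polynomial of degree n(n − 1)/2 in the n variables a_1, ..., a_n; it can be 
written as the product (a_2 − a_1)···(a_n − a_{n−1}) = ∏_{j>k}(a_j − a_k). V_n(a_1, ..., a_n) 
= 0 if and only if a_j = a_k for some k ≠ j. Moreover, the minors of entries in row 
n are themselves Vandermondes: in particular, the minor of a_i^{n−1} is 
V_{n−1}(a_1, ..., â_i, ..., a_n) (the hat marks the omitted entry), which is precisely 
∏_{j>k; j,k≠i}(a_j − a_k). Hence, 

\[
Q'(r_i) = (-1)^{n-i}\, \frac{V_n(r_1,\dots,r_n)}{V_{n-1}(r_1,\dots,\widehat{r_i},\dots,r_n)} .
\]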


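Now given polynomial P with 0 ≤ m = deg P ≤ n − 2, 

\[
\begin{aligned}
\sum_{i=1}^{n} \frac{P(r_i)}{Q'(r_i)}
 &= \sum_{i=1}^{n} (-1)^{n-i} P(r_i)\, \frac{V_{n-1}(r_1,\dots,\widehat{r_i},\dots,r_n)}{V_n(r_1,\dots,r_n)} \\
 &= \frac{(-1)^{n-1}}{V_n(r_1,\dots,r_n)} \sum_{i=1}^{n} (-1)^{i-1} P(r_i)\, V_{n-1}(r_1,\dots,\widehat{r_i},\dots,r_n) \\
 &= \frac{1}{V_n(r_1,\dots,r_n)} \det\begin{pmatrix}
1 & 1 & \cdots & 1 \\
r_1 & r_2 & \cdots & r_n \\
\vdots & \vdots & & \vdots \\
r_1^{n-2} & r_2^{n-2} & \cdots & r_n^{n-2} \\
P(r_1) & P(r_2) & \cdots & P(r_n)
\end{pmatrix},
\end{aligned}
\]
where \widehat{r_i} means that r_i is omitted and the last step is the cofactor expansion along the bottom row 
(the entry in column i carries the sign (−1)^{n+i} = (−1)^{n−1}(−1)^{i−1}). 
However, P(x) has degree ≤ n − 2 so that each P(r_i) is the same linear 
combination of 1, r_i, r_i², ..., r_i^{n−2} and hence the determinant is zero, as desired. 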


Note that this result (and Abel’s proof but not ours) is true when n = 2, m = 0; 
n = 2, P = 0; or n = 1, P = 0. These cases are easily proved by differentiation 
and substitution in (A). 
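
A quick exact-arithmetic check of (A) may reassure a skeptical student. The sketch below is only an illustration: the helper names are hypothetical, Q is taken monic with the listed distinct roots, and Q'(r_i) is computed as the product of the differences r_i − r_j over j ≠ i.

```python
from fractions import Fraction

def poly_eval(coeffs, x):
    """Evaluate the polynomial with coefficients [c0, c1, ...] at x (Horner's rule)."""
    acc = Fraction(0)
    for c in reversed(coeffs):
        acc = acc * x + c
    return acc

def q_prime_at_root(roots, i):
    """Q'(r_i) for monic Q with the given distinct roots: product of (r_i - r_j), j != i."""
    prod = Fraction(1)
    for j, r in enumerate(roots):
        if j != i:
            prod *= roots[i] - r
    return prod

def abel_sum(p_coeffs, roots):
    """Sum of P(r_i)/Q'(r_i) over the roots r_i of Q."""
    roots = [Fraction(r) for r in roots]
    return sum(poly_eval(p_coeffs, r) / q_prime_at_root(roots, i)
               for i, r in enumerate(roots))

# P(x) = 5 - 3x + 7x^2 has degree 2 = n - 2 for the four roots below.
print(abel_sum([5, -3, 7], [1, 2, -4, 6]))  # 0
```

Raising the degree of P to n − 1 makes the sum equal to the leading coefficient of P instead of zero, which shows that the hypothesis deg P ≤ n − 2 cannot be relaxed.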
We have noticed that recent texts ignore the Vandermonde so that even our 
advanced students have never heard of it, nor its discoverer, nor indeed the entire 
category of problems which drove the work of Lagrange, Galois, Abel and their 
heirs. The topic (along with the rest of theory of equations, now “lost’”) is well 
suited to independent study, a mini-course or a special project, if there is no room 
for it in the modern over-crowded pregraduate major sequence. 


We wish to thank the referee for a useful comment. 
More on pi 


I may well have made a mistake in my note How to Make Pi Equal to 
Three in the February 1992 issue of this Monthly, but it was not the 
mistake that Professor Dario Castellanos points out in his recent letter 
(Monthly, January 1993), because I did not take my ruler aboard the 
spinning circle. 

Professor Castellanos takes a ruler on board a spinning circle, and 
uses it to measure the circumference of a stationary circle. Naturally, he 
gets a value of π greater than usual. 

I used a stationary ruler to measure the circumference of a spinning 
circle, so I got a value of π less than usual. 

The trouble is I used special relativity instead of general relativity. A 
spinning circle is an accelerated system, which calls for general relativity. 
I assumed that for large radius we could approximate circular motion 
with straight line motion, and proceeded accordingly. 

But I never took my ruler on board the spinning circle. 

Along these same lines, here is a paradox I cannot explain. Suppose a 
train on a circular track is so long that it reaches all the way around, and 
the caboose is hitched to the locomotive. If the train travels near the 
speed of light, each car decreases in length, so we have a short train 
filling a long track. What happens? 

—Rick Norwood 

Department of Mathematics 
East Tennessee State University 
Johnson City TN 37614 
NOTES 


Edited by: John Duncan 


Embedding Countable Groups 
in 2-Generator Groups 


Fred Galvin 


The aim of this note is to popularize a simple proof, due to Neumann and 
Neumann [5], of the fact that every countable group is embeddable in a 2-genera- 
tor group. This was first proved by Higman, Neumann, and Neumann [2, Theorem 
IV] and (independently) Freudenthal [2, p. 254], using free products with amalga- 
mations. The proof given by Neumann and Neumann [5] used wreath products, 
which are widely regarded as no less terrifying than free products with amalgama- 
tions. I am indebted to Professors A. M. W. Glass, G. Higman, and P. M. 
Neumann, each of whom pointed out (in response to an earlier version of this 
note) that the Neumann-Neumann proof is really quite simple, and that it can 
easily be expressed directly in terms of permutations. Here, then, is a short proof 
that assumes no more background than is needed to understand the statement of 
the theorem. 

As usual, Z is the set of integers and N is the set of natural numbers; ⟨a, b⟩ is 
the group generated by a and b; Sym(Ω) is the group of all permutations of a set 
Ω; permutations are regarded as right operators, and are composed from left to 
right. 


Theorem 1. Every countable group is embeddable in a 2-generator group. 


Proof: Consider a countable group G = {g_1, g_3, g_5, ...}; the elements are 
indexed by odd positive integers. We may assume that G is a subgroup of 
Sym(N). Define permutations a and b in Sym(Z × Z × N) by setting (m, n, p)a = 
(m + 1, n, p) and 

\[
(m,n,p)b = \begin{cases}
(m,\, n+1,\, p) & \text{if } m = 0; \\
(m,\, n,\, p g_m) & \text{if } m \text{ is odd},\ m > 0,\ n \ge 0; \\
(m,\, n,\, p) & \text{otherwise.}
\end{cases}
\]

Let b_i = a^i b a^{−i} and ĝ_i = b_i b^{−1} b_i^{−1} b for i = 1, 3, 5, ... . Straightforward (if slightly 
tedious) calculation shows that (m, n, p)ĝ_i = (0, 0, pg_i) if m = n = 0, while 
(m, n, p)ĝ_i = (m, n, p) otherwise. Thus Ĝ = {ĝ_i : i = 1, 3, 5, ...} is a subgroup of 
⟨a, b⟩ isomorphic to G. 
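
The “slightly tedious” calculation can be spot-checked by machine on a finite window of Z × Z × N. The sketch below rests on several assumptions that are not in the proof: the toy group is the cyclic group of order 3 acting on {0, 1, 2}, indices without an entry in the toy table act trivially, the middle case of b is read as applying for n ≥ 0, and the commutator is read as b_i b^{−1} b_i^{−1} b. All function names are hypothetical.

```python
from itertools import product

# Toy group (an assumption for illustration): the cyclic group of order 3 on {0, 1, 2},
# with elements indexed by odd positive integers as in the proof; p * g_i = G[i][p].
G = {1: (0, 1, 2), 3: (1, 2, 0), 5: (2, 0, 1)}

def g(i, p):
    """p * g_i; indices missing from the toy table act as the identity."""
    return G[i][p] if i in G else p

def g_inv(i, p):
    """p * g_i^{-1}."""
    return G[i].index(p) if i in G else p

def a(pt, k=1):
    """(m, n, p) a^k = (m + k, n, p)."""
    m, n, p = pt
    return (m + k, n, p)

def b(pt):
    m, n, p = pt
    if m == 0:
        return (m, n + 1, p)
    if m % 2 == 1 and m > 0 and n >= 0:
        return (m, n, g(m, p))
    return pt

def b_inv(pt):
    m, n, p = pt
    if m == 0:
        return (m, n - 1, p)
    if m % 2 == 1 and m > 0 and n >= 0:
        return (m, n, g_inv(m, p))
    return pt

def b_i(i, pt, inverse=False):
    """b_i = a^i b a^{-i} (or its inverse), applied left to right."""
    step = b_inv if inverse else b
    return a(step(a(pt, i)), -i)

def g_hat(i, pt):
    """b_i b^{-1} b_i^{-1} b, applied left to right."""
    return b(b_i(i, b_inv(b_i(i, pt)), inverse=True))
```

On the window −6 ≤ m ≤ 6, −4 ≤ n ≤ 4 one can then compare g_hat(i, (m, n, p)) with (0, 0, p g_i) at m = n = 0 and with (m, n, p) elsewhere; of course such a check is no substitute for the calculation itself.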

It may have occurred to the reader to wonder whether Theorem 1 can be 
proved by just showing that any countable subgroup of a symmetric group S = 
Sym(Ω) is contained in a 2-generator subgroup of S. In fact, this was a question of 
Wagon [8]. An old theorem of Sierpiński [7, 9] says that any countable set of 
selfmaps of an infinite set Ω is contained in the semigroup generated by two 
selfmaps of Ω; Wagon asked whether one could replace “selfmaps” by 
“permutations” in Sierpiński’s theorem. The answer is yes [1], but the proof is a bit more 
involved and will appear elsewhere. 

The two generators in the proof of Theorem 1 were both of infinite order. B. H. 
Neumann [4, p. 541] remarked that every countable group is embeddable in a 
2-generator group with generators of prescribed orders q > 8 and r > 2; this was 
improved by Levin [3] to q ≥ 3 and r ≥ 2, which is the best possible result in this 
direction. The proof of Levin’s result is a little too complicated to give here; 
however, we can get two generators of finite order by modifying the proof of 
Theorem 1. We need the following easy lemma: 


Lemma 1 [6, Exercise 10.1.17, p. 259]. Every permutation is the product of two 
involutions. 


Proof: It suffices to consider the case of a permutation consisting of a single (finite 
or infinite) cycle. Note, e.g., that a 6-cycle is obtained by multiplying the 
involutions (1, 2)(3, 4)(5, 6) and (2, 3)(4, 5). This example can easily be generalized to get 
cycles of any desired length. 
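
The 6-cycle example is easy to confirm mechanically. In this minimal sketch (helper names hypothetical) permutations are stored as dictionaries and multiplied from left to right, in keeping with the right-operator convention used here.

```python
def perm_from_cycles(cycles, n):
    """Permutation of {1, ..., n} as a dict, built from disjoint cycles."""
    perm = {k: k for k in range(1, n + 1)}
    for cyc in cycles:
        for idx, k in enumerate(cyc):
            perm[k] = cyc[(idx + 1) % len(cyc)]
    return perm

def compose(s, t):
    """Left-to-right product: apply s first, then t."""
    return {k: t[s[k]] for k in s}

def cycle_of(perm, start):
    """The cycle of perm through `start`, listed from `start`."""
    cyc, k = [start], perm[start]
    while k != start:
        cyc.append(k)
        k = perm[k]
    return cyc

s = perm_from_cycles([(1, 2), (3, 4), (5, 6)], 6)
t = perm_from_cycles([(2, 3), (4, 5)], 6)
print(cycle_of(compose(s, t), 1))  # [1, 3, 5, 6, 4, 2]
```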


Theorem 2. Every countable group is embeddable in a 2-generator group with one 
generator of order 11 and the other of order 2. 


Proof: Let G be a countable group. We may assume that G is a subgroup of 
Sym(N); moreover, by Theorem 1 and Lemma 1, we may assume that G is 
generated by four involutions, which we call g_3, g_5, g_7, and g_9. Define 
permutations a and b in Sym(Z_11 × Z × N), of orders 11 and 2 respectively, by setting 
(m, n, p)a = (m + 1, n, p) and 

\[
(m,n,p)b = \begin{cases}
(m,\, n + (-1)^n,\, p) & \text{if } m = 0; \\
(m,\, n - (-1)^n,\, p) & \text{if } m = 1; \\
(m,\, n,\, p g_m) & \text{if } m \in \{3,5,7,9\},\ n \equiv 0 \pmod 4,\ n \ge 0; \\
(m,\, n,\, p) & \text{otherwise.}
\end{cases}
\]

Let c = (baba^{−1})². Note that (0, n, p)c = (0, n + 4(−1)^n, p), while (m, n, p)c = 
(m, n, p) if m ≠ 0. Let b_i = a^i b a^{−i} and ĝ_i = b_i c^{−1} b_i c for i = 3, 5, 7, 9. Then 
(m, n, p)ĝ_i = (0, 0, pg_i) if m = n = 0, while (m, n, p)ĝ_i = (m, n, p) otherwise. 
Hence ⟨ĝ_3, ĝ_5, ĝ_7, ĝ_9⟩ is a subgroup of ⟨a, b⟩ isomorphic to G. 


Theorem 3. Every countable group is embeddable in a 2-generator group with one 
generator of prescribed order q ≥ 5 and the other of order 2. 


Proof: By Theorem 2 and Lemma 1, we may assume that the given countable 
group is a subgroup of Sym(N) generated by three involutions, which we call g_2, 
g_3, and g_4. Define permutations a and b in Sym(Z_q × Z × N), of orders q and 2 
respectively, by setting (m, n, p)a = (m + 1, n, p) and 

\[
(m,n,p)b = \begin{cases}
(m,\, n + (-1)^n,\, p) & \text{if } m = 0; \\
(m,\, n - (-1)^n,\, p) & \text{if } m = 1; \\
(m,\, n,\, p g_2) & \text{if } m = 2,\ n \equiv 0 \pmod{24},\ n \ge 0; \\
(m,\, n,\, p g_3) & \text{if } m = 3,\ n \equiv 8 \pmod{24},\ n \ge 0; \\
(m,\, n,\, p g_4) & \text{if } m = 4,\ n \equiv 16 \pmod{24},\ n \ge 0; \\
(m,\, n,\, p) & \text{otherwise.}
\end{cases}
\]

Let c = (baba^{−1})⁴; then (0, n, p)c = (0, n + 8(−1)^n, p), while (m, n, p)c = 
(m, n, p) if m ≠ 0. Let b_i = a^i b a^{−i} and ĝ_i = c^{i−2} b_i c^{−3} b_i c^{5−i} for i = 2, 3, 4; then 
(m, n, p)ĝ_i = (0, 0, pg_i) if m = n = 0, while (m, n, p)ĝ_i = (m, n, p) otherwise. 


ACKNOWLEDGMENTS. I thank the editor and referee for their encouragement and advice. 
Abelian Forcing Sets 


Joseph A. Gallian and Michael Reid 


Many readers of the MONTHLY have encountered particular cases of the following 
question. Suppose G is a group and n is an integer with the property that 
(ab)^n = a^n b^n for all a and b in G. Which values of n imply that G is Abelian? 
Indeed, standard exercises in undergraduate abstract algebra textbooks ([1], [2], [3], 
[4]) are to show that n = 2 and n = −1 are two such values. Are there others? If 
n ∈ Z, we say that a group G is n-Abelian if (xy)^n = x^n y^n for all x, y ∈ G. Thus 
our question may be reformulated as “for which integers n is an n-Abelian group 
necessarily Abelian?” If p is any prime, consider the non-Abelian group 

\[
G_p = \left\{ \begin{pmatrix} 1 & a & b \\ 0 & 1 & c \\ 0 & 0 & 1 \end{pmatrix} : a, b, c \in \mathbb{Z}_p \right\}.
\]

If p is odd, then x^p = e for all x ∈ G_p. We say that a group G has exponent n if 
x^n = e for all x ∈ G. Thus, G_p has exponent p. Also, G_2 (which is isomorphic to 
the group of symmetries of a square) has exponent 4. Note that if G is a group 
with exponent n, then for any integer k, G is kn-Abelian and (kn + 1)-Abelian. 
The examples G_p are now sufficient to show that the only integers n for which 
n-Abelian implies Abelian are n = 2 and n = −1. Indeed, for p odd, G_p is 
pk-Abelian and (pk + 1)-Abelian for any integer k, while G_2 is 4k-Abelian and 
(4k + 1)-Abelian. 

More generally, let us call a set of integers T Abelian forcing if whenever G is a 
group with the property that G is n-Abelian for all n in T, then G is Abelian. So 
far we have seen that the only singleton Abelian forcing sets are {−1} and {2}. 
What about other sets? Both of Herstein’s algebra textbooks ([3, p. 31] and [4, 
p. 57]) include the exercise that sets containing three consecutive integers are 
Abelian forcing. Moreover, one of Herstein’s books ([4, p. 57]) has an exercise that 
{3, 5} is an Abelian forcing set. In contrast, the set {3, 7} is not Abelian forcing, as 
G_3 is both 3-Abelian and 7-Abelian. 
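
Claims of this kind about the groups of upper unitriangular 3 × 3 matrices over Z_p can be verified by brute force for small p. The sketch below is an illustration only: a matrix is stored as its triple (a, b, c) of entries above the diagonal, only exponents n ≥ 0 are handled, and the function names are hypothetical.

```python
from itertools import product

def mul(x, y, p):
    """Product of two upper unitriangular 3x3 matrices over Z_p, stored as (a, b, c)."""
    a, b, c = x
    a2, b2, c2 = y
    return ((a + a2) % p, (b + b2 + a * c2) % p, (c + c2) % p)

def power(x, n, p):
    """x**n for n >= 0 by repeated multiplication; (0, 0, 0) is the identity."""
    result = (0, 0, 0)
    for _ in range(n):
        result = mul(result, x, p)
    return result

def is_n_abelian(p, n):
    """True if (xy)^n == x^n y^n for all x, y in the group (n >= 0)."""
    elems = list(product(range(p), repeat=3))
    return all(power(mul(x, y, p), n, p) == mul(power(x, n, p), power(y, n, p), p)
               for x in elems for y in elems)

print(is_n_abelian(3, 3), is_n_abelian(3, 7), is_n_abelian(3, 2))  # True True False
```

The last value is False because a 2-Abelian group is Abelian, while the group for p = 3 is not.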

What characterizes the Abelian forcing sets? Although we could not find the 
answer to this precise question in the literature, some of the essential features of 
our argument below can be gleaned from a paper by F. Levi [5] written in the 
group-theoretic language of fifty years ago. (Levi investigated the question of when 
the mapping a > a” is a group endomorphism.) Our formulation of the question, 
the answer and the proof make the material more accessible to undergraduates. 


Theorem. A set T of integers is Abelian forcing if and only if the greatest common 
divisor of the integers n(n — 1) as n ranges over T is 2. (Note that each n(n — 1) is 
even.) 


Proof: The necessity of the condition again follows from the examples G_p. For p 
prime, let T_p = {n ∈ Z | 2p divides n(n − 1)}. Then for p odd, T_p = {pk, pk + 1 | 
k ∈ Z}, while T_2 = {4k, 4k + 1 | k ∈ Z}. From our earlier observation, G_p is 
n-Abelian for each n ∈ T_p, so T_p is not Abelian forcing. If the greatest common 
divisor of the integers n(n − 1), n ∈ T, is not 2, then T is contained in T_p for some 
prime p, so T is not Abelian forcing either. This proves necessity. 

To prove sufficiency of the condition, suppose that T ⊆ Z satisfies gcd(n(n − 
1) | n ∈ T) = 2, and G is a group which is n-Abelian for all n ∈ T. Let S = 
{n ∈ Z | G is n-Abelian}, so that T ⊆ S. First note that if m, n ∈ S, then mn ∈ S. 
Also, if n ∈ S, then for any x, y ∈ G, we have (xy)^n = x^n y^n, so that (yx)^{n−1} = 
x^{n−1} y^{n−1}, whence (yx)^{1−n} = y^{1−n} x^{1−n}. Thus, if n ∈ S, then 1 − n ∈ S. Since 
n = 1 − (1 − n), the converse holds as well. 

Our main difficulty at this point is that S is not closed under addition. However, 
suppose that n ∈ S has the property that x^n ∈ Z(G) (the center of G) for all 
x ∈ G. Then, for arbitrary m ∈ S, x, y ∈ G, we have (xy)^{m+n} = (xy)^m (xy)^n = 
x^m y^m x^n y^n = x^m x^n y^m y^n = x^{m+n} y^{m+n}, so m + n ∈ S. 

This motivates the definition R = {n ∈ S | x^n ∈ Z(G) for all x ∈ G}. It is easy 
to see that n ∈ R if and only if −n ∈ R. Thus, from our previous remark, R is an 
additive subgroup of Z. We now claim that if n ∈ S, then n(n − 1) ∈ R. We do 
this in several steps. 

Note that if n ∈ S, then 1 − n ∈ S, so n(1 − n) ∈ S. For arbitrary x, y ∈ G, 
we have y x^n y^n y^{−1} = y(xy)^n y^{−1} = (yx)^n = y^n x^n, so that y^{1−n} x^n = x^n y^{1−n}. Thus 
n-th powers commute with (1 − n)-th powers. Now, for any x ∈ G, x^{n(1−n)} is both 
an n-th power and a (1 − n)-th power. Thus, for any y ∈ G, x^{n(1−n)} commutes 
with both y^n and y^{1−n}, and therefore also with y. This shows that x^{n(1−n)} ∈ Z(G), 
so that n(1 − n) and thus, also, n(n − 1) are in R. 

We are now in position to prove sufficiency. Since the greatest common divisor 
of the numbers n(n — 1) for n € T is 2, the additive subgroup R of Z contains 2. 
Therefore, G is 2-Abelian, and thus Abelian. This proves sufficiency. 

Finally, to see that {n, n + 1, n + 2} is Abelian forcing, note that n(n − 1) − 
2(n + 1)n + (n + 2)(n + 1) = 2, so that 


gcd(n(n — 1),(n + 1)n,(n + 2)(n + 1)) =2. 
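
The criterion of the theorem is a one-line computation for finite sets. The following sketch (function name hypothetical) tests a few of the sets mentioned above.

```python
from functools import reduce
from math import gcd

def is_abelian_forcing(T):
    """Criterion of the theorem: the gcd of n(n - 1) over n in T equals 2."""
    return reduce(gcd, (n * (n - 1) for n in T), 0) == 2

print(is_abelian_forcing({3, 5}))        # True:  gcd(6, 20) = 2
print(is_abelian_forcing({3, 7}))        # False: gcd(6, 42) = 6
print(is_abelian_forcing({10, 11, 12}))  # True:  three consecutive integers
```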


The authors recently discovered that the problem addressed here 
appeared as a problem in the MONTHLY in 1974 (E2411, vol. 81, 
page 410). It is also a special case of a result of L. C. Kappe, “On 
n-Levi groups”, Arch. Math., 47 (1986) 198-210. 
UNSOLVED PROBLEMS 
Edited by: Richard Guy 


In this department the MONTHLY presents easily stated unsolved problems dealing 
with notions ordinarily encountered in undergraduate mathematics. Each problem 
should be accompanied by relevant references (if any are known to the author) and by 
a brief description of known partial or related results. Typescripts should be sent to 


Richard Guy, Department of Mathematics & Statistics, The University of Calgary, 
Alberta, Canada T2N 1N4. 


Is There a k-Anisohedral Tile for k ≥ 5? 


John Berglund 


Let 𝒯 be a monohedral (all tiles are congruent) tiling of the Euclidean plane [2, 
p. 20]. Let S(𝒯) be the group of symmetries which map 𝒯 onto itself. For a given 
tile T in 𝒯 let the transitivity class of T be the collection of all tiles to which T can 
be mapped by one of the symmetries of S(𝒯). If 𝒯 has precisely k transitivity 
classes, call 𝒯 k-isohedral. Since a picture is worth 10³ words, let us look at an 
example. FIGURE 1 is a tiling that is 1-isohedral. 1-isohedral tilings are also called 
merely isohedral. FIGURE 2 is a tiling that is 2-isohedral. The shaded tiles form one 
transitivity class, and the unshaded tiles form another. 


Figure 1. 1-isohedral tiling. Figure 2. 2-isohedral tiling. 


If a tile permits a k-isohedral tiling but not any n-isohedral tiling for n < k, call 
the tile k-anisohedral. For example, the tile given in FIGURE 3 is 2-anisohedral. 
The members of the shaded transitivity class bite the chins of the members of the 
unshaded transitivity class. The members of the unshaded transitivity class bite the 
noses. The tile in FIGURE 2 is not 2-anisohedral since it allows the 1-isohedral 
tiling given in FIGURE 1. 3-anisohedral tiles are rarer. As one example, take the 
Stein pentagon (FIGURE 4) [2, p. 518]. 


Figure 4. 3-anisohedral tile. 

4-anisohedral tiles are still rarer. The figure shown in FIGURE 5 seems to be the 
first published example. Note that the shape is made by joining nine equilateral 
triangles at the edges. 


Problem 1. Do there exist k-anisohedral tiles for every k? Examples have been 
found for k ≤ 4. 


Problem 2. Characterize all k-anisohedral tiles for low values of k. This has been 
solved for k = 1 in Grünbaum and Shephard [1]. The general problem has not 


been solved even for k = 2. 


To give the reader a taste of these topics, two 4-isohedral tilings are given in 
FIGURES 6 and 7. Are these shapes 4-anisohedral or not? The shape in FIGURE 6 is 
made by joining six regular hexagons. The shape in FIGURE 7 is based on a shape 
made of nine equilateral triangles joined edge to edge; but the edges have been 
replaced with centrosymmetric curves. 


Figure 6. Hexahex in a 4-isohedral tiling. 

Figure 7. Modified 9-iamond in a 4-isohedral tiling. 
Indiana Academy 
Cicero, IN 46034 


“...She knew only that if she did or said 
thus-and-so, men would unerringly respond 
with the complimentary thus-and-so. It was 
like a mathematical formula and no more 
difficult, for mathematics was the one subject 
that had come easy to Scarlett in her school- 


days.” 
From Gone With the Wind 
by Margaret Mitchell 
Submitted by Steven C. Althoen 
PROBLEMS AND SOLUTIONS 


Edited by: 
Richard T. Bumby, Fred Kochman and Douglas B. West 


Proposed problems should be sent to the MONTHLY PROBLEMS address given on 
the inside front cover. Please include solutions, relevant references, etc. Three copies 
are requested. 


Solutions of published problems should arrive before November 30, 1993 at the 
MONTHLY PROBLEMS address given on the inside front cover. Solutions should be 
typed with double spacing, including the problem number and the solver’s name and 
mailing address. Two copies suffice. A self-addressed postcard or label should be 
included if an acknowledgment is desired. 


An asterisk (*) after the number of a problem, or part of a problem, indicates that 
no solution is currently available. Partial solutions will be useful in such cases. 
Otherwise, the published solution is likely to be based on a solution which is complete 
and correct. Of course, an elegant partial solution or a method leading to a more 
general result is always useful and welcome. In addition, references to other 
appearances of MONTHLY problems or to solutions of these problems in the 
literature are also solicited. 


PROBLEMS 


10314. Proposed by Andrew Vince, University of Florida, Gainesville, FL. 


Let b be an integer greater than 1. Let S be a set of integers containing 0 such 
that no two members of S are congruent modulo b. If 


with s_i ∈ S, prove that all s_i = 0. 


10315. Proposed by Mowaffaq Hajja, Yarmouk University, Irbid, Jordan. 


Let A and B be matrices with integer entries of sizes r by n and n by r, 
respectively, with r <n. Suppose that AB is an r by r identity matrix. Show that 
A can be enlarged to an n by n integral matrix having an integral inverse. 


10316. Proposed by Richard K. Guy, University of Calgary, Calgary, Alberta, 
Canada, and Richard J. Nowakowski, Dalhousie University, Halifax, N.S., Canada. 


For what pairs of integers a, b does ab exactly divide a² + b² + 1? 
10317. Proposed by Juan Bosco Romero Márquez, Universidad de Valladolid, 
Valladolid, Spain. 


Let △ABC be inscribed in a circle and let A', B', C' be the midpoints of the 
arcs BC, CA, AB, respectively. 

(a) Prove that the incenter of △ABC is the orthocenter of △A'B'C'. 

(b) Prove that the pedal triangle of △A'B'C' is homothetic to △ABC. 


10318. Proposed by William P. Wardlaw, United States Naval Academy, Annapolis, 
MD. 


Suppose that A is an n by n matrix with rational entries whose multiplicative 
order is 15; i.e., A^15 = I, an identity matrix, but A^k ≠ I for 0 < k < 15. For which 
n can one conclude from this that 


    I + A + A² + ··· + A^14 = 0? 


10319. Proposed by Nick MacKinnon, Winchester College, Winchester, U.K. 


Define 
\[
S_k(n) = \sum_{r=1}^{n} \sin(r^k).
\]

Examination of graphs of S_k(n) as a function of n for 1 < k < 2 reveals some 
striking patterns. For example, when k = 1.4 the graph divides into clearly defined 
regions: between n = 1 and n = 36, one has −.5 < S_{1.4}(n) < 3; then the value of 
the function changes rapidly from S_{1.4}(36) ≈ 2.95 to S_{1.4}(49) ≈ −5.7; then one has 
−6.2 < S_{1.4}(n) < −1.4 as n goes from 49 to 225; then there is a rapid increase 
from S_{1.4}(225) ≈ −6.2 to S_{1.4}(257) ≈ 15.7. This pattern persists as far as graphs 
have been drawn. 

(a) As a first step to understanding this phenomenon, determine the locations of 
the jumps in S_{1.4}(n). 

(b) Show that similar behavior may be expected for all k with 1 < k < 2, with 
the flat regions being longer for k close to 1 and shorter for k close to 2. 
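
Readers who wish to reproduce such graphs can tabulate the partial sums directly. The following minimal sketch (function name hypothetical) computes the sum of sin(r^k) for r = 1, ..., n in floating point; plotting is left to the reader's favorite package.

```python
import math
from itertools import accumulate

def partial_sums(k, n_max):
    """Return the list [S_k(1), ..., S_k(n_max)], where S_k(n) = sum of sin(r**k), r = 1..n."""
    return list(accumulate(math.sin(r ** k) for r in range(1, n_max + 1)))

values = partial_sums(1.4, 300)
for n in (36, 49, 225, 257):
    print(n, round(values[n - 1], 2))
```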


10320. Proposed by Ignacy I. Kotlarski, Oklahoma State University, Stillwater, OK. 


Under the assumption that f_0, f_1 and f_2 are defined on [0, ∞), Laplace 
transformable, and not equivalent to zero, solve the integral equation 


[me Foe) Bry — 0) fal ta — 2) de = em AN(L — em minced), 
0 
with x_1 ≥ 0 and x_2 ≥ 0, for the three functions f_0, f_1 and f_2. 


10321. Proposed by Carl Axness, Sandia National Laboratories, Albuquerque, NM, 
Reinhard Schäfke, University of Essen, Essen, Germany, and David Arterburn, New 
Mexico Tech., Socorro, NM. 


Let μ be a positive real number. Prove 


° iG 
lim (In x)'/" ye an comme == PC/h) 
x—1it* i=l 2b 
NOTES 


Notes: (10317) The incenter of a triangle is the center of its inscribed circle, and 
the orthocenter is the point of intersection of its altitudes. The feet of the altitudes 
of a triangle are the vertices of its pedal triangle. (10319) See the cover! 


SOLUTIONS 


Repeated Cyclotomic Factors 


E 3442 [1991, 438]. Proposed by Ray Wylie, Furman University, Greenville, SC. 


Given a sequence $\{b_n\}_{n=0}^{\infty}$ of real numbers such that $b_n = 0$ for $n$ sufficiently
large, put $B_s = \sum_{j=0}^{\infty} b_{s+jm}$ $(s = 0, 1, \ldots, m-1)$, and let us say that the sequence
$\{b_n\}_{n=0}^{\infty}$ has property $P_m$, where $m$ is a positive integer, if

$$B_0 = B_1 = \cdots = B_{m-1}.$$

Suppose $a_1, a_2, \ldots, a_r$ are positive integers, not necessarily distinct, and let
$C(n)$ be the number of $r$-tuples $(n_1, n_2, \ldots, n_r)$ of integers such that

$$n = n_1 + n_2 + \cdots + n_r, \qquad 0 \le n_i \le a_i \quad \text{for } i = 1, 2, \ldots, r.$$

Prove that the $k$ sequences

$$\{C(n)\}_{n=0}^{\infty},\ \{nC(n)\}_{n=0}^{\infty},\ \ldots,\ \{n^{k-1}C(n)\}_{n=0}^{\infty}$$

all have property $P_m$ if and only if at least $k$ of the integers $a_1, a_2, \ldots, a_r$ are
congruent to $-1$ modulo $m$.


Solution by O. P. Lossers, Eindhoven University of Technology, Eindhoven, The
Netherlands. We consider $m > 1$, and let $\zeta = \exp(2\pi i/m)$. Let $B(x) = \sum_{n=0}^{\infty} b_n x^n$
and $F(x) = \sum_{s=0}^{m-1} B_s x^s$. Observe that $B(\zeta^t) = F(\zeta^t)$ for each $t \in \{1, 2, \ldots, m-1\}$.
Therefore, if $\{b_n\}_{n=0}^{\infty}$ has property $P_m$, then $B(\zeta^t) = 0$ for each $t \in \{1, 2, \ldots, m-1\}$
and $B(x)$ is divisible by $1 + x + \cdots + x^{m-1}$. Furthermore, if $B(\zeta^t) = 0$ for each
$t \in \{1, 2, \ldots, m-1\}$, then since $F(x)$ is a polynomial of degree $m - 1$, $F(x)$ must
be a constant times $1 + x + \cdots + x^{m-1}$ so that $\{b_n\}_{n=0}^{\infty}$ has property $P_m$.

Observe that the polynomial $C(x) = \sum_{n=0}^{\infty} C(n) x^n$ satisfies

$$C(x) = \prod_{i=1}^{r} \left(1 + x + \cdots + x^{a_i}\right).$$

Also, for every integer $t \ge 0$, the polynomial $\sum_{n=0}^{\infty} n^t C(n) x^n$ is a linear combination
of the polynomials

$$C(x),\ xC'(x),\ x^2 C''(x),\ \ldots,\ x^t C^{(t)}(x).$$
Hence, if the $k$ given sequences all have property $P_m$, then the previous paragraph
implies that all the polynomials $C(x), C'(x), \ldots, C^{(k-1)}(x)$ have at least one factor
$1 + x + \cdots + x^{m-1}$, which implies that $C(x)$ is divisible by $(1 + x + \cdots + x^{m-1})^k$.
Hence, $\zeta$ is a zero of $C(x)$ of multiplicity at least $k$. Since each factor
$1 + x + \cdots + x^{a_i}$ has only simple zeroes, it follows that $a_i + 1$ is a multiple of $m$ for at
least $k$ values of $i$. For the converse, observe that if $a_i + 1$ is a multiple of $m$ for
at least $k$ values of $i$, then $C(x)$ is divisible by $(1 + x + \cdots + x^{m-1})^k$. We can
conclude that $\zeta^t$ is a root of $\sum_{n=0}^{\infty} n^j C(n) x^n$ for each $j \in \{0, 1, \ldots, k-1\}$ and
each $t \in \{1, 2, \ldots, m-1\}$. The converse now follows from the previous paragraph.
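
The statement can be checked by brute force on small inputs. The sketch below is only an illustration, assuming plain Python; the function names `coefficient_counts`, `has_property` and `max_k` are hypothetical helpers, not notation from the solution. It expands the product for $C(x)$, forms the residue-class sums $B_s$, and reports how many of the sequences $\{n^j C(n)\}$, $j = 0, 1, \ldots$, have property $P_m$ in a row.

```python
def coefficient_counts(a):
    """Coefficients C(0), C(1), ... of prod_i (1 + x + ... + x^{a_i})."""
    coeffs = [1]
    for ai in a:
        new = [0] * (len(coeffs) + ai)
        for n, c in enumerate(coeffs):
            for step in range(ai + 1):
                new[n + step] += c
        coeffs = new
    return coeffs

def has_property(b, m):
    """True if the sums B_0, ..., B_{m-1} over residue classes mod m all agree."""
    sums = [sum(b[s::m]) for s in range(m)]
    return len(set(sums)) == 1

def max_k(a, m):
    """Largest k such that {n^j C(n)} has property P_m for every j < k (m > 1)."""
    c = coefficient_counts(a)
    k = 0
    # The bound k <= len(a) is only a safety stop for the loop.
    while k <= len(a) and has_property([n ** k * cn for n, cn in enumerate(c)], m):
        k += 1
    return k
```

For example, with `a = [2, 5, 3]` and `m = 3`, two of the integers (2 and 5) are congruent to $-1$ modulo 3, and `max_k` returns 2, in agreement with the result proved above.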


Solved also by D. Callan, R. J. Chapman (United Kingdom), M. Dindos (Slovakia), K. S. Kedlaya 
(student), N. Komanda, and the National Security Agency Problems Group. One incorrect solution was 
received. 


Hardy’s Inequality for Geometric Series 


6663 [1991, 559]. Proposed by Walther Janous, Ursulinengymnasium, Innsbruck, 
Austria and the editors. 


Show that

$$\sum_{j=1}^{N} \left( \frac{1 + x + x^2 + \cdots + x^{j-1}}{j} \right)^2 < (4 \log 2)\left(1 + x^2 + x^4 + \cdots + x^{2N-2}\right)$$

for $0 < x < 1$ and all positive integers $N$; also show that the constant $4 \log 2$ is best
possible. (If we drop the factor $\log 2$, we have a special case of Hardy's inequality;
see Hardy, Littlewood, and Pólya, Inequalities, pp. 239-242.)


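Solution by Rolf Richberg, RWTH Aachen, Aachen, Germany. Observing that for
$0 < x < 1$ and $N \in \mathbb{N}$

$$\int_x^1\!\!\int_x^1 \frac{1 - (st)^N}{1 - st}\, ds\, dt = \sum_{j=1}^{N} \int_x^1\!\!\int_x^1 (st)^{j-1}\, ds\, dt = \sum_{j=1}^{N} \left( \frac{1 - x^j}{j} \right)^2,$$

we may reformulate the assertion as

$$\int_x^1\!\!\int_x^1 \frac{1 - (st)^N}{1 - x^{2N}} \cdot \frac{ds\, dt}{1 - st} < (4 \log 2)\, \frac{1 - x}{1 + x} \qquad (0 < x < 1,\ N \in \mathbb{N}), \tag{1}$$

which obviously would follow from

$$\int_x^1\!\!\int_x^1 \frac{ds\, dt}{1 - st} < (4 \log 2)\, \frac{1 - x}{1 + x} \qquad (0 < x < 1). \tag{2}$$

In order to prove (2) we consider the double integral:

$$\begin{aligned}
\int_x^1\!\!\int_x^1 \frac{ds\, dt}{1 - st}
&= \int_x^1 \left[ -\frac{1}{t} \log(1 - st) \right]_{s=x}^{s=1} dt \\
&= -\int_x^1 \log(1 - t)\, \frac{dt}{t} + \int_{x^2}^{x} \log(1 - t)\, \frac{dt}{t} \\
&= -2 \int_x^1 \log(1 - t)\, \frac{dt}{t} + \int_{x^2}^{1} \log(1 - t)\, \frac{dt}{t}.
\end{aligned}$$

Substituting $t = \tau^2$ in the last integral and noting $\log(1 - \tau^2) = \log(1 - \tau) +
\log(1 + \tau)$, we obtain

$$\int_x^1\!\!\int_x^1 \frac{ds\, dt}{1 - st} = 2 \int_x^1 \log(1 + t)\, \frac{dt}{t},$$

thus transforming (2) into

$$\int_x^1 \log(1 + t)\, \frac{dt}{t} < (2 \log 2)\, \frac{1 - x}{1 + x} \qquad (0 < x < 1). \tag{3}$$

Now, differentiating with respect to $t$ shows that $(1/t)\log(1 + t)$ is a decreasing
and $(1 + (1/t))\log(1 + t)$ an increasing function of $t \in (0, 1)$. For $0 < x < 1$ we
therefore have

$$\int_x^1 \log(1 + t)\, \frac{dt}{t} < (1 - x)\, \frac{\log(1 + x)}{x} = \frac{1 - x}{1 + x} \left( 1 + \frac{1}{x} \right) \log(1 + x) < \frac{1 - x}{1 + x}\, 2 \log 2,$$

which proves (3).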

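Suppose (1) is valid with $4 \log 2$ replaced with a constant $c$. Taking the limit for
$N \to \infty$ we then get

$$2 \int_x^1 \log(1 + t)\, \frac{dt}{t} = \int_x^1\!\!\int_x^1 \frac{ds\, dt}{1 - st} \le c\, \frac{1 - x}{1 + x} \qquad (0 < x < 1).$$

In

$$\frac{2}{1 - x} \int_x^1 \log(1 + t)\, \frac{dt}{t} \le \frac{c}{1 + x}$$

(which comes from the transformation used to obtain (3)) let $x$ tend to 1. It follows
that $2 \log 2 \le c/2$, i.e., $c \ge 4 \log 2$. Thus, the constant $4 \log 2$ is best possible.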
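
A quick numerical sanity check of the inequality is possible. This is a sketch under the assumption of ordinary double-precision arithmetic; the helper name `hardy_sides` is illustrative only. It evaluates both sides of the stated inequality directly from their definitions.

```python
import math

def hardy_sides(x, N):
    """Return (lhs, rhs) of the inequality for given 0 < x < 1 and N >= 1."""
    lhs = 0.0
    partial = 0.0  # running value of 1 + x + ... + x^(j-1)
    for j in range(1, N + 1):
        partial += x ** (j - 1)
        lhs += (partial / j) ** 2
    # Right-hand side: (4 log 2)(1 + x^2 + ... + x^(2N-2)).
    rhs = 4 * math.log(2) * sum(x ** (2 * j) for j in range(N))
    return lhs, rhs

if __name__ == "__main__":
    for x in (0.5, 0.9, 0.99):
        lhs, rhs = hardy_sides(x, 500)
        print(x, lhs / rhs)
```

The printed ratio stays below 1, and the solution's limiting argument suggests it should creep toward 1 as $x$ tends to 1 with $N$ large, which is what "best possible" means here.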


Solved also by H. Morris.


An Extremal Set Problem


E3459 [1991, 754]. Proposed by Constantin Adrian, Timisoara, Romania. 


Suppose $X$ is an $n$-element set, $n \ge 12$, and suppose $F$ is a family of 4-element
subsets of $X$ such that the intersection of each pair of distinct sets in $F$ has at
most two elements. Prove that there is a subset $S$ of $X$ containing at
least $(6n - 6)^{1/3}$ elements such that none of the 4-element subsets of $S$ is in the
family $F$.


Solution by Fred Galvin, University of Kansas, Lawrence, KS. Call a subset of $X$
independent if it contains no member of $F$. Assuming $n \ge 3$, we show that every
maximal independent subset of $X$ has size greater than $(6n)^{1/3}$.

Let $S$ be a maximal independent set, with $k = |S|$. Clearly $k \ge 3$. Since $S$ is
maximal, for each $x \in X - S$ there is a 3-element set $f(x) \subset S$ such that
$f(x) \cup \{x\} \in F$. The condition on $F$ implies that $f$ is injective. Hence $n - k \le \binom{k}{3}$, or

$$6n \le 6\binom{k}{3} + 6k = k^3 - 3k^2 + 8k \le k^3 - 3,$$

and so $k \ge (6n + 3)^{1/3}$.
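
The counting argument can be watched in action on a small example. The sketch below is an assumption-laden illustration, not part of Galvin's solution: the family is built greedily (one admissible family among many), and the names `greedy_family` and `maximal_independent` are hypothetical. A single greedy pass yields a maximal independent set, because an element rejected once stays rejected as the set grows.

```python
from itertools import combinations
from math import comb

def greedy_family(n):
    """Greedily pick 4-subsets of range(n) that pairwise share at most 2 elements."""
    family = []
    for quad in combinations(range(n), 4):
        q = set(quad)
        if all(len(q & f) <= 2 for f in family):
            family.append(q)
    return family

def maximal_independent(n, family):
    """Grow a set that contains no member of the family; the result is maximal."""
    s = set()
    for x in range(n):
        candidate = s | {x}
        if not any(f <= candidate for f in family):
            s = candidate
    return s

if __name__ == "__main__":
    n = 12
    k = len(maximal_independent(n, greedy_family(n)))
    # Compare k with the bounds from the solution.
    print(k, n - k <= comb(k, 3), k ** 3 - 3 >= 6 * n)
```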


Editorial comment. Solvers proved various inequalities of the form $k \ge (6n + c)^{1/3}$
for $n \ge n_0$; by choosing $n_0$ large enough, $c$ can be made arbitrarily large.
Fred Galvin noted that this problem is a special case of a class of problems
discussed (but not explicitly solved) by Paul Erdős in “Problems and results on
graphs and hypergraphs: similarities and differences,” in Mathematics of Ramsey
Theory (J. Nešetřil and V. Rödl, eds.), Springer-Verlag, 1990, 12-28. The function
$h_r(n, p, q)$ is the maximum integer $m$ such that if each $r$-element subset of an
$n$-set $X$ is colored red or blue, then there exists a $p$-element subset of $X$
containing at least $q$ red $r$-sets or an $m$-element subset of $X$ whose $r$-sets are all
blue. The statement of this problem is $h_4(n, 5, 2) \ge (6n - 6)^{1/3}$ for $n \ge 12$.


Solved also by G. Calinescu (student, Romania), R. J. Chapman (U. K.), J. R. Griggs, R. Jeurissen 
(the Netherlands), I. Kastanas, O. P. Lossers (The Netherlands), B. Peterson, P. Tracy, and the 
proposer. 


Source-Even Orientations of Graphs 


E 3462 [1991, 755]. Proposed by J. J. Rotman, University of Illinois at Urbana, 
Champaign, IL. 


Prove that any connected simple graph with an even number of edges has an 
orientation (assignment of direction to each edge) such that the number of edges 
leaving each vertex is even. 


Solution I by Richard Holzsager, The American University, Washington, DC. 
Suppose the edges of the graph are oriented to minimize the number of vertices 
that are sources of an odd number of edges. Since the total number of edges is 
even, there must be an even number of such vertices. If this even number is 
positive, find a path connecting two of these vertices (guaranteed by the graph 
being connected) and reverse the orientation of each edge in the path. This does 
not change the parity of edges leaving any intermediate vertex along the path, but 
it changes the endpoints from odd to even, contradicting minimality. 
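
The path-reversal idea in Solution I also gives a direct procedure: start from any orientation and repeatedly repair two odd vertices at a time. The following sketch is a constructive variant written for illustration, not the solver's own formulation; it assumes vertices are labelled `0..n-1`, the graph is connected, and the function name `even_out_orientation` is hypothetical.

```python
from collections import deque

def even_out_orientation(n, edges):
    """Orient a connected graph with an even number of edges so that every
    vertex has even out-degree. edges: list of (u, v); returns (tail, head) pairs."""
    if len(edges) % 2:
        raise ValueError("the number of edges must be even")
    orient = list(edges)  # arbitrary starting orientation u -> v
    adj = [[] for _ in range(n)]
    for idx, (u, v) in enumerate(edges):
        adj[u].append((v, idx))
        adj[v].append((u, idx))

    def odd_vertices():
        out = [0] * n
        for tail, _ in orient:
            out[tail] += 1
        return [v for v in range(n) if out[v] % 2]

    odd = odd_vertices()
    while odd:
        src, dst = odd[0], odd[1]
        # Breadth-first search for a path from src to dst.
        parent = {src: None}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            if u == dst:
                break
            for w, idx in adj[u]:
                if w not in parent:
                    parent[w] = (u, idx)
                    queue.append(w)
        # Reverse every edge on the path; only the endpoints change parity.
        v = dst
        while parent[v] is not None:
            u, idx = parent[v]
            tail, head = orient[idx]
            orient[idx] = (head, tail)
            v = u
        odd = odd_vertices()
    return orient
```

Each pass removes two vertices from the odd list, so the loop ends after at most half as many passes as there were odd vertices at the start.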


Solution II by Jerrold Grossman, Oakland University, Rochester, MI. It is
well-known that every connected graph with an even number of edges can be
decomposed into edge-disjoint copies of $P_3$, the path containing two edges. (See,
for example, exercise 8.21a in G. Chartrand and L. Lesniak, Graphs and Digraphs
(Second edition), Wadsworth, 1986.) Given such a decomposition, we orient the
edges of each $P_3$ from its center toward its endpoints. This orientation has an even
number of edges leaving every vertex.


Editorial comment. (1) Many solvers noted that simplicity of the graph is not 
necessary; neither of the above proofs requires this assumption. (2) Many solvers 
also mentioned the following easy extension to connected graphs with an odd 
number of edges: For any vertex v in such a graph, there is an orientation such 
that the number of edges leaving v is odd and the number of edges leaving every 
other vertex is even. (3) F. Galvin, J. Conklin, and E. Stone proved that the 
number of orientations having the desired property is exactly $2^{m-n+1}$, where $m$ is
the number of edges and n is the number of vertices. (4) F. Galvin offered an 
extension to infinite graphs: Let $G$ be an infinite graph and let $V_0$ be the set of
vertices of finite degree. Then, for any mapping $p\colon V_0 \to \{0, 1\}$, there is an
orientation of $G$ such that, for every vertex $v \in V_0$, the number of edges leaving $v$
has the same parity as p(v). 


Solved also by 46 others and the proposer. 
A Permutation on the Cube


6670 [1991, 862]. Proposed by R. H. Jeurissen, Toernooiveld, Nijmegen, The Netherlands.


Let $\{0, 1\}^n$ denote the set of $n$-bit strings of zeros and ones. If $(a_1, \ldots, a_n) \in
\{0, 1\}^n$, let $\pi_n(a_1, \ldots, a_n)$ be the string $(b_1, \ldots, b_n)$ given by $b_1 = a_1$ and
$b_k = a_k + a_{k-1} \pmod 2$ for $1 < k \le n$. Since $(a_1, \ldots, a_n)$ can be retrieved from
$(b_1, \ldots, b_n)$, it is clear that $\pi_n$ is a permutation of $\{0, 1\}^n$. Determine the cycle
structure of the permutation $\pi_n$, i.e., the lengths of the cycles that occur and the
number of cycles of each length.


Solution by Thomas Honold, Technische Universität München, München, Germany
and Sonja Maus, Bonn, Germany (independently). The cycle lengths are
powers of 2. If $c_k$ denotes the number of cycles of length $2^k$, then

$$c_k = \begin{cases}
2 & \text{if } k = 0, \\
2^{-k}\left(2^{2^k} - 2^{2^{k-1}}\right) & \text{if } 2 \le 2^k \le n, \\
2^{-k}\left(2^{n} - 2^{2^{k-1}}\right) & \text{if } n < 2^k < 2n, \\
0 & \text{if } 2^k \ge 2n.
\end{cases}$$

To establish this result, identify $\{0, 1\}^n$ with the vector space $V$ of $n$-tuples over
GF(2), the field with two elements. Then $\pi_n$ is an automorphism of $V$, and
$\pi_n = I + N$, where $I$ is the identity operator and $N$ is the shift operator defined
by $N(a_1, \ldots, a_n) = (0, a_1, \ldots, a_{n-1})$. Note that $N$ is nilpotent, with $N^n = 0$. Since
we are working over GF(2), we have $\pi_n^{2^k} = I + N^{2^k}$ for every positive integer $k$. It
follows that the order of $\pi_n$ is a power of 2 and hence so is the length of every
cycle of $\pi_n$. Now $v \in V$ is fixed by $\pi_n$ if and only if $v \in \ker(N)$, and $v$ lies in a
cycle of length $2^k$ with $k \ge 1$ if and only if $v \in \ker(N^{2^k}) - \ker(N^{2^{k-1}})$. The
result now follows from the observation that $\dim(\ker(N^l)) = \min\{n, l\}$.
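
The cycle counts can be compared with a brute-force enumeration for small $n$. The sketch below assumes plain Python; `apply_map`, `cycle_counts` and `formula` are illustrative names. The function `formula` uses the kernel dimensions $\min\{n, l\}$ from the solution, which covers the middle two cases of the closed form at once.

```python
from itertools import product

def apply_map(bits):
    """b_1 = a_1 and b_k = a_k + a_{k-1} (mod 2) for k > 1."""
    return (bits[0],) + tuple(
        (bits[k] + bits[k - 1]) % 2 for k in range(1, len(bits))
    )

def cycle_counts(n):
    """Brute force: map cycle length -> number of cycles of that length."""
    seen = set()
    counts = {}
    for start in product((0, 1), repeat=n):
        if start in seen:
            continue
        length = 0
        v = start
        while v not in seen:
            seen.add(v)
            v = apply_map(v)
            length += 1
        counts[length] = counts.get(length, 0) + 1
    return counts

def formula(n):
    """Cycle counts predicted from dim ker(N^l) = min(n, l)."""
    counts = {1: 2}  # the two fixed points, c_0 = 2
    k = 1
    while 2 ** k < 2 * n:
        vectors = 2 ** min(n, 2 ** k) - 2 ** min(n, 2 ** (k - 1))
        counts[2 ** k] = vectors // 2 ** k
        k += 1
    return counts
```

For instance, `formula(3)` gives two fixed points, one cycle of length 2 and one cycle of length 4, which accounts for all 8 strings.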


Editorial comment. J. C. Binz and Tad White (independently) extended the 
result to the analogous automorphism of the vector space of n-tuples over GF(p) 
for any prime p. Arthur Woerheide extended it even further to the analogous 
automorphism of $G^n$ where $G$ is any finite-dimensional vector space over GF(p).
These extensions are easily derived by appropriate modifications to the given 
solution. 


Solved by 30 solvers (including those cited) and the proposer.


Deferred Cesàro Means


10217 [1992, 362]. Proposed by Brian Philp, The University of Birmingham, Birmingham, England.


Suppose $\{a_n\}_{n=1}^{\infty}$ is a sequence of complex numbers.


(a) Prove that if $n^{-1}\sum_{j=n}^{2n} a_j \to A$ and $n^{-1}\sum_{j=n}^{4n} a_j \to 3A$, then $n^{-1} a_n \to 0$.


(b) Is it true that if $n^{-1}\sum_{j=n}^{3n} a_j \to 2A$ and $n^{-1}\sum_{j=n}^{9n} a_j \to 8A$, then $n^{-1} a_n \to 0$?


Solution by Robin J. Chapman, University of Exeter, Exeter, U. K. 


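(a) By replacing $a_n$ by $a_n - A$ we may, and shall, assume that $A = 0$. Let
$\beta_n = n^{-1} a_n$. Now

$$\frac{a_{2n}}{n} = \frac{1}{n} \sum_{j=n}^{2n} a_j + \frac{2}{2n} \sum_{j=2n}^{4n} a_j - \frac{1}{n} \sum_{j=n}^{4n} a_j$$

and so $\beta_{2n} \to 0$ as $n \to \infty$. It follows that $\sigma_n = n^{-1}\sum_{j=n}^{2n-1} a_j \to 0$ as $n \to \infty$. Now
$(n + 1)\sigma_{n+1} - n\sigma_n = a_{2n+1} + a_{2n} - a_n$ and so

$$\beta_{2n+1} = \frac{n+1}{2n+1}\,\sigma_{n+1} - \frac{n}{2n+1}\,\sigma_n - \frac{2n}{2n+1}\,\beta_{2n} + \frac{n}{2n+1}\,\beta_n = \gamma_n + \frac{n}{2n+1}\,\beta_n,$$

where $\gamma_n \to 0$ as $n \to \infty$. Hence $|\beta_{2n+1}| \le |\gamma_n| + |\beta_n|/2$. Given $\varepsilon > 0$ choose $N$
such that $|\gamma_n|, |\beta_{2n}| < \varepsilon$ if $n \ge N$. Let $K = \max(|\beta_N|, |\beta_{N+1}|, \ldots, |\beta_{2N}|)$. I claim
that if $2N \le m \le 4N$ then $|\beta_m| \le \varepsilon + K/2$. This is trivial if $m$ is even and if
$m = 2n + 1$ is odd then $|\beta_m| \le |\gamma_n| + |\beta_n|/2 \le \varepsilon + K/2$ as required. If we define
$f(K) = \varepsilon + K/2$ then iterating this argument gives $|\beta_m| \le f^r(K)$ (the $r$-fold
iterate of $f$, applied to $K$) for $2^r N \le m \le 2^{r+1} N$. Now for fixed $\varepsilon$ and large $r$,
$f^r(K) \to 2\varepsilon$ and hence $|\beta_m| \le 3\varepsilon$, as required.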

(b) The answer is “no.” I claim we can choose the $a_{3k+1}$, for $k \ge 0$, arbitrarily
so that $\sum_{j=n}^{3n} a_j = \sum_{j=n}^{9n} a_j = 0$. We make $a_{3k} = 0$ for all $k \ge 1$ and define
$a_{3k+2}$ for $k \ge 0$, recursively by $a_{3k+2} = -\sum_{j=k+1}^{3k+1} a_j$. It is now clear that
$\sum_{j=n}^{3n} a_j = \sum_{j=n}^{9n} a_j = 0$. But choosing say $a_{3k+1} = k^2$ we can make $\{n^{-1} a_n\}_{n=1}^{\infty}$ unbounded.
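
The construction in part (b) is concrete enough to compute. The sketch below assumes plain Python with 1-based indexing emulated by leaving `a[0]` unused; the name `build_sequence` is illustrative. It takes $a_{3k+1} = k^2$, sets $a_{3k} = 0$, and chooses $a_{3k+2}$ by the recursion, so that both windowed sums vanish while $a_n/n$ grows.

```python
def build_sequence(limit):
    """Return a list a with a[1..limit] built as in part (b); a[0] is unused."""
    a = [0] * (limit + 1)
    for j in range(1, limit + 1):
        k, residue = divmod(j, 3)
        if residue == 1:
            a[j] = k * k                      # free choice a_{3k+1} = k^2
        elif residue == 2:
            a[j] = -sum(a[k + 1:3 * k + 2])   # forces a_{k+1} + ... + a_{3k+2} = 0
        # residue == 0: a_{3k} stays 0
    return a

if __name__ == "__main__":
    a = build_sequence(900)
    n = 50
    print(sum(a[n:3 * n + 1]), sum(a[n:9 * n + 1]), a[301] / 301)
```

The first terms are $0, 0, 0, 1, -1, 0, 4, -4, 0$, and the sums over $n \le j \le 3n$ and $n \le j \le 9n$ are zero for every $n$ within range.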

Editorial comment. The problem arose in connection with noticing that the
claimed necessary and sufficient condition

$$\frac{1}{n} \sum_{j=n}^{\lambda n} a_j \to (\lambda - 1)s \quad \text{as } n \to \infty \text{ for some } \lambda > 1$$

for Cesàro summability given as “analytic background” in Theorem IV of N. H.
Bingham, “On Tauberian theorems in probability theory”, Nieuw Arch. Wisk. (4) 3
(1985), 157-166 was incorrect. Part (b) provides a counterexample to this statement.
In general, a second value of $\lambda > 1$ is needed. A proof of the positive result
given in part (a) due to Prof. B. Kuttner was provided by the proposer.


Solved also by M. Dindos (Slovakia), N. J. Fine, K. S. Kedlaya (student), O. P. Lossers (The 
Netherlands), M. Mocsy (Hungary), and R. Stong. One solution dealing only with part b, one for which 
only part b was correct, and one incorrect solution were also received. 


Collaborating editors: David F. Appleyard, Paul T. Bateman, Bruce C. Berndt, 
Duane M. Broline, Barry W. Brunson, Frank S. Cater, Gulbank D. Chakerian, 
Underwood Dudley, Gerald A. Edgar, Michael A. Filaseta, Ira M. Gessel, Richard
A. Gibbs, Jerrold R. Griggs, Douglas A. Hensley, John R. Isbell, Mourad E. H.
Ismail, Murray Klamkin, Daniel J. Kleitman, Frederick W. Luttmann, Frank B.
Miles, Richard Pfiefer, Stephen L. Portnoy, J. O. Shallit, John Henry Steelman,
Kenneth B. Stolarsky, David E. Tepper, Douglas B. Tyler, Daniel Ullman, Edward 
T. H. Wang, and William E. Watkins. 


Answer to Picture Puzzle: 
(p. 538) 


John Littlewood, sometimes described as the name Hardy in- 
vented for a collaborator. 
Thomas Archer Hirst—
Mathematician Xtravagant 
III. Gottingen and Berlin 


J. Helen Gardner and Robin J. Wilson 




I carry with me all manner of letters of introduction to Gottingen, but the making of new
acquaintances is ever a task to me; and for my own part I would rather have dispensed with 
extraneous help in doing so. It is a worse thing to be over- than under-estimated, and there is 
ever far more satisfaction in silently carving one’s own path than in having it made ready and 
carpeted for us... 


The University of Gottingen 


With the completion of his Ph.D. thesis in Marburg, Thomas Hirst decided to 
travel, making Gottingen his first port of call. Here he spent two weeks at the 
University, attending lectures and conducting magnetic experiments with the 
physicist Wilhelm Weber, ‘a curious little fellow [who] speaks in a shrill, unpleas- 
ant and hesitating voice’. He also attended Moritz Stern’s ‘beautifully clear’ 
lectures on integral calculus and mechanics, and was much impressed by him. 
6th August 1852: To-day I called on Weber again. He received me kindly and explained to me a
new method of his of determining the inclination of the Magnet. He speaks and stutters on 
unceasingly; one has nothing to do but to listen. Sometimes he laughs for no earthly reason, and 
one feels sorry at not being able to join him. At 2 p.m. I went with him to make some 
experiments—he, I and another student made a determination of the Magnetic Inclination. I 
read off the ‘Auschlags Winkel’ [angle of inclination] and though my first attempt, we got a
result of 67° 26’—that is within the daily variation... 

Stern is a stern fellow, a firm, rub-against-able fellow—not an atom of unnecessary ceremony 
about him, but the greatest plainness and character. We got a bit of brown bread and butter 
together, as he would have taken himself, with a glass of water to it, and then a cigar. At first we 
talked about a dark point in his lecture, then on a multitude of topics. Stern is a widower with a 
family, a housekeeper; I believe his wife destroyed herself. At any rate, she went insane and 
either killed herself or died. It is said that for long after he was a misanthrope—one can see 
several deep wrinkles of sorrow in his face, though his manner is now gentle and quiet... 


Carl Friedrich Gauss (1777-1855) on the terrace of Gottingen University 


But the highlight of his Gottingen trip was a visit to Carl Friedrich Gauss, which he 
recorded in great detail. 


12th August 1852: ...Personally he is a venerable, fine old fellow, with a contented manly 
expression. There is an extraordinary aspect of power about him and his every word: without 
effort he suggests to every one the presence of manly might. He is about 80 years of age, but not 
a trace of superannuation is to be seen about him. He can even read without spectacles. 
Although our interview as far as the conversation was concerned was not brilliant or extraordi- 
nary, for in it there was no effort on either side, yet for remembrance sake I will try to relate it. 
No sooner was the first word spoken than I felt perfectly at ease, and he pointed for me to sit on 
the sofa and took a chair close by me. We spoke of course all in German, though he can speak 
English. 

Tom. I am sorry, Professor, that I have not had the opportunity of hearing a lecture from you 
during my stay in Gottingen. It was, in fact, one of my principal motives for coming—a kind of 
curiosity perhaps it may be called, yet for a lover of science I hope at least an excusable one.
Gauss. Ah, this semester few students announced themselves, and at my age, with other work to 
be yet finished by me, and the hot summer before me, I was glad rather than otherwise to be 
dispensed from the task. 

Tom (after a short silence). Have you ever been in England, Professor? 

Gauss. No, I never got further than Belgium, and now the difficulties of the journey, as well as 
the change of life and habits render it impossible for me. 

Tom. It is true this difference in habits and life affect even younger and stronger persons. I 
suffered somewhat myself from the same cause. 

Gauss. Yet, considering the great esteem I have for the English as a nation, which we may 
consider a model for us as far as steady, persevering toil and firmness of character are 
concerned, it is now strange to me that in my younger days I never visited it. 

Tom. You have, however, the consolation to know that it was your own work that prevented you. 
Yet there is something about your German life (especially student life) that so far excels English 
as yearly to draw more and more of us to your universities... 

So we chatted on quite comfortably for three-quarters of an hour, and then I bid the old 
veteran good-bye and thanked him heartily. I left him copies of some of Tyndall’s memoirs and 
of my own dissertation. 

As a mathematician, Gauss is without doubt our Sir Isaac Newton. Perhaps no one ever had 
a firmer reliance in the absolute truth of mathematics, or lived more in it. It is to him the 
foundation of the universe on which God himself has built... 


The University of Berlin, now known as Humboldt University 


After leaving Gottingen, Hirst spent several weeks travelling around Germany and 
Austria with his brother John and some friends from Marburg. He then moved to 
Berlin, where he spent the winter semester, from October to April. On his arrival, 
he went to visit the algebraist Ferdinand Eisenstein, ‘a young and highly promising 
mathematician’. Unfortunately, Eisenstein had died just the day before. This, 
understandably, upset Hirst considerably. 
12th October 1852: ...A young, able fellow, cut down the moment he was making his ability
known and useful—a fellow of deep intellect and great industry, too, as late journals can show. 
We entered joyfully, thinking to see him and know him; we left it awe-struck, silent and sad. 


The next morning he called on Lejeune Dirichlet, the distinguished analyst and 
number theorist, and ‘met with a very hearty reception’. 


13th October 1852: He is a rather tall, lanky-looking man, with moustache and beard about to 
turn grey (perhaps 45 years old), with a somewhat harsh voice and rather deaf: it was early, he 
was unwashed, and unshaved (what of him required shaving), with his “schlafrock”, slippers, cup
of coffee and cigar... I thought, as we sat each at an end of the sofa, and the smoke of our
cigars carried question and answer to and fro, and intermingled in graceful curves before it rose
to the ceiling and mixed with the common atmospheric air, “If all be well, we will smoke our
friendly cigar together many a time yet, good-natured Lejeune Dirichlet.”


Although he continued to read widely in all branches of mathematics, he increas- 
ingly felt the need to further his knowledge of geometry. He was particularly 
interested in the relationship between synthesis and analysis. 


15th October 1852: After having purchased three valuable volumes, Carnot’s “Geometrie de
Position”, Gauss’s “Theory of Numbers” (in French) and Cauchy’s application of Diff: Cal: to
Geometry, I find my hands full of work. My own books have also arrived from Marburg; how
dependent one is on books. I felt lost without them, and had to ask myself: “Tom, hast thou
nothing then in thee, but must be strung and wound up before thou canst begin playing?”...


18th October 1852: ...In Carnot’s “Geometrie de Position” with which I am now engaged, after
an able discussion of the comparative merits of and distinctions between so-called analysis and 
synthesis occurs the following rather noteworthy paragraph: 

“Synthesis is not exclusively applied to mathematics—it is in general the art of reasoning 
with justice: whatever may be the subject of argument. It is identical with what is termed 
dialectics... Analysis proceeds generally differently, in that series of transformations are made 
on truncated parts of the discourse, and taken isolately are unintelligible, but which submitted 
like the others to the mechanism of argumentation can by a new series of transformations lead to 
clear and precise results—as much so, indeed, as those deduced synthetically...” 

... Carlyle tells me that just at this time he [Carnot] was taking his part in the French 
Revolution—who knows but he may have written it after that memorable dinner party at which 
he and Robespierre were with others present, when Carnot slipped out of the room, searched 
Robespierre’s pocket that he had laid aside, and found therein a sentence of death for himself 
and others with whom that Robespierre had been chatting quite coolly... 


It was not long before the lectures began. Hirst was particularly impressed by those 
of Jakob Steiner, both for his pure synthetic approach to geometry (which ac- 
corded with Hirst’s own views) and also for his attitudes to education, which he 
acquired while studying at the school of the Swiss reformer Pestalozzi. 


28th October 1852: ...I have heard Steiner twice, and am well pleased with him. He is a 
middle-aged man, of pretty stout proportions, has a long, intellectual face, with beard and 
moustache, and a fine prominent forehead, hair dark and rather inclining to turn grey. The first 
thing that strikes you on his face is a dash of care and anxiety almost pain, as if arising from 
physical suffering. Before starting he sets his chair right, looks all round, finds the window must 
be opened, and with difficulty gets started. Then in a short time he will ask them to close the 
window within a hand’s breadth, for he has rheumatism. All these point to physical nervous 
weakness. His Geometry is famed for its ingenuity and simplicity—he is an immediate pupil of 
Pestalozzi: in his youth was a poor shepherd boy, and now a professor. His argument is that the 
simplest way is the best; he tries ever to find out the way Nature herself adopts (not always, 
however, to be relied upon). Mathematics he defines to be the “Science of what is
self-evident”...
But most of all, he admired the teaching of Dirichlet:


31st October 1852: Dirichlet cannot be surpassed for richness of material and clear insight into 
it: as a speaker he has no advantages—there is nothing like fluency about him, and yet a clear 
eye and understanding make it dispensable: without an effort you would not notice his hesitating 
speech. What is peculiar in him, he never sees his audience—when he does not use the 
black-board at which time his back is turned to us, he sits at the high desk facing us, puts his 
spectacles up on his forehead, leans his head on both hands, and keeps his eyes, when not 
covered with his hands, mostly shut. He uses no notes, inside his hands he sees an imaginary 
calculation, and reads it out to us—that we understand it as well as if we too saw it. I like that 
kind of lecturing. 


Lejeune Dirichlet (1805-1859) Jakob Steiner (1796-1863) 


While enjoying the lectures of Steiner and Dirichlet, he was also developing their 
friendship. His social calls became increasingly frequent, and his diary entries 
around this time seem to alternate between the two. However, the two men could 
not have been more different, and this difference shines out of the words—the 
grumpy and ailing Steiner, with ‘a power of insight possessed by no other living 
geometer, perhaps’, and the genial Dirichlet, with whom he was becoming ‘on 
terms of perfect friendship’. But, for all this, he seems to have been fond of both of 
them, and any adverse comments were statements of fact, as he saw it, rather than 
condemnations. 


7th November 1852: ...To-day I called on Prof. Riess of the “Konigliche Academie der 
Wissenschaft”... Riess is a delicate, good man, with clear, deep insight. I listened with great 
interest to his talk about Dirichlet, Jacobi and Steiner. He told me fully the relations on which 
the latter stands with them all, and truly it is unexplainable. Riess says his vulgarity has by them 
all been slightly borne in consideration of his undoubted genius. But that some time ago without 
provocation Steiner cut them all. The probable reason is that Steiner, naturally of a testy 
disposition, which has been increased, too, by bodily illness, feels himself slighted that he has 
been 33 years “Ausserordentliche” [Extraordinary] Professor. The reason is clear: firstly he does
not know Latin, and that among German professors is held as a necessity: 2nd he is so terribly
one-sided on the question of Synthetical Geometry that as an examiner he would not be liked. 
The more I hear, the more I am determined to see him and study him for myself. 


14th November 1852: ... Wednesday evening I spent with Dirichlet: saw Mrs Dirichlet again, 
found she was sister to Mendelssohn—she played me several of her brother’s pieces, to which I
listened with great willingness. 


21st November 1852: On Tuesday I called again to see Steiner. He came to me first in his 
ante-room, when we had a little interesting talk on his system of Synthetical Geometry, and its 
relation to Analysis. The latter he would by no means annihilate, and pleads justly that 
heretofore it has but had too great pre-eminence to the detriment of Synthesis. I mentioned that 
I had in view to translate his work into English—the old fellow’s indifference towards me has 
been somewhat relaxing before, and this was the finishing stroke... 


Despite his involvement with mathematics, Hirst continued to keep up his interests
in the sciences, attending a lecture by the distinguished geologist Christian von
Buch, and admiring a model of Foucault’s pendulum, introduced two years
previously for demonstrating the rotation of the Earth.


19th December 1852: On Thursday du Bois [Reymond] came to take me to the Königliche
Academie. Old von Buch was reading a paper on the chalk formations of America. He is also a 
fine old fellow, with a stern iron face... Steiner walked in with a very self-possessed, glum face, 
which relaxed into a smile and a bow as he passed me near the door. He then stood at one 
corner of the table, took a large, deliberate pinch of snuff, and eyed the assembled company... 
Buch was sitting reading his paper, and finally Steiner came to a standstill with his back against 
the stove, took snuff very largely, and seemed to have no connection with anybody. 


13th February 1853: Monday at Magnus’s—a party of scientific ladies and gentlemen there. The 
evening was spent by Magnus shewing us several experiments. Magnus was shewing a model of 
Foucault’s experiment with the pendulum; as he said, for Dirichlet’s and my especial interest. 
The only remark the ladies made on the matter (and they always make some) was that motion of 
the pendulum was “sehr gracieuse.”


His studies took most of his waking hours, but it wasn’t all work and no play. Hirst 
took advantage of the crisp winter weather to pursue a ‘favourite amusement’. 


20th February 1853: This has been a winter’s week indeed—10° below Zero in Réaumur. Cabs
has been almost entirely replaced by sledges, and the streets are in a continual tinkle, tinkle. I 
have for the first time in my life had a ride in one, on Wednesday, and very easy riding I found 
it. On Wednesday, and this afternoon, I had some skating for the first time in Germany— 
indeed, for many years. Swimming in summer and skating in winter are the two greatest physical 
enjoyments I indulge in, and I have a weakness for both. The scene on the ice here in the 
Thiergarten is novel and attractive to me. There are ladies by the score skating beautifully... 
Prof. Mitscherlich’s daughter was skating gracefully past me, and I, rogue that I am, was looking 
at her skates (and the ankles to which they were fastened) instead of my own, and I suffered for 
it. With a fearful crash I came on my rump. I did not break the ice, but I certainly left “my 
mark” there in the shape of an asterisk... 


But more often than not, Hirst was working hard, involved both with his mathe- 
matics, and also with his teachers—in particular, Dirichlet and Steiner. There are 
comments on their teaching styles, as well as frequent references to their personal- 
ities and activities. 


20th February 1853: ...I will here record a few peculiarities in Steiner’s lectures. He never 
prepares them beforehand, but follows ideas as they suggest themselves. He thus often stumbles 
or fails to prove what he wishes at the moment, and at every such failure he is sure to make 
some characteristic remark... Dirichlet has also his peculiarities—one is of forgetting time; he 
pulls his watch out, finds it past three, and runs out without even finishing the sentence. 


624 THOMAS ARCHER HIRST [August-September 


3rd April 1853: ...Steiner remains working at home until nearly 4 p.m.; he then dines at 
Schultz’s Wein Keller—after dinner he takes a long walk until 6 or 7, returns home, works or
sleeps until 11:30 or 12, and then comes to Schultz’s again to drink a glass of grog and eat his 
supper. Here he shows his testiness most—the waiter he rated soundly for not bringing him his 
usual sort of wine—a person interrupted him whilst speaking, and got from him a stormy 
reprimand. After 1 a.m. we accompanied him home... Yesterday after dining again with 
Knoblauch, I was walking down the Linden and heard an unmistakeable voice (viz. Steiner’s) 
calling my name. I turned with him, and we had a short walk. We spoke much on the old 
question of the relative claims of Analysis and Synthesis in Geometry, and I found him more 
liberal than ever before. As we returned, I asked if for once he would step into my lodgings and 
sit half an hour with me. He did so. And thus for once he paid a visit—a thing he seldom or 
never does... 


24th April 1853: Wednesday evening we spent with Dirichlet... During the evening Prof. 
Hensel (husband to the late Fanny Hensel, another sister of Mendelssohn’s) came. It was 
proposed that we should make the experiment of moving the table (Tischrücken) which is now
the subject of fashionable twaddle. Hensel had seen it the evening before and believed in it 
thoroughly. I and Dirichlet were thoroughly sceptical, Mrs D. was indifferent, and Dickinson and 
another gentleman were inclined to be believers. We sat for half an hour, each one placing both 
hands on the table and placing the little finger of his right hand on the little finger of his 
neighbour’s left hand. The table was a pretty stout round one, with one leg rolling on three 
pulleys. It was easily movable by a single person. The experiment was totally unsuccessful... I 
believe the table would have moved this evening had we all been in a sufficient unanimity; one 
thought the table was leaning a little in one direction; directly two or three more are intent upon 
it so moving, look anxiously in that direction, and unconsciously help it. The scientific men in
Berlin almost all ridicule the idea. It deserves, however, a closer experiment. 


Shortly afterwards, Hirst left Berlin for two months in Paris, where he attended 
the mathematics lectures of Joseph Liouville and Gabriel Lamé, before returning 
to England to take up a schoolteaching appointment at Queenwood College, in 
Hampshire. His time at Queenwood, his marriage, and his subsequent return to 
France, will be described in the next article. 
Bisectors of Triangles and Tetrahedra


W. A. Beyer and Blair Swartz 


1. INTRODUCTION. Our principal goal is to discuss and illustrate (in Figures 2,
3 and 4) the envelope of the planes that bisect a tetrahedron. To bisect, here,
means to divide into two pieces of equal volume. The envelope of the hyperplanes
that bisect a simplex in $\mathbb{R}^n$ and the envelope of the lines dividing a triangle into
two pieces of fixed relative size are also considered (the last in Figure 5). These
problems arose in work in numerical hydrodynamics dealing with approximating, 
within local second-order accuracy, a smooth boundary separating a black and 
white region in the plane, given discretely located gray values associated with a 
blurring of that interface (Swartz [17], exemplified at the end of §8 below). But the 
problems have a much older history in hydrostatics and naval architecture, as they 
are also connected with the orientation and stability of floating bodies. Favard’s 
book [9] contains a satisfactory discussion of envelopes in $\mathbb{R}^2$ and $\mathbb{R}^3$; there are
elementary discussions in Courant [7] or Courant and John [8]. 


2. A GENERAL DESCRIPTION. The envelope E of the planes that bisect a 
tetrahedron is homeomorphic to such traditional examples of closed, one-sided 
surfaces as the “Roman surface” of Steiner (see: Francis [10, pp. 83-86], Hilbert 
and Cohn-Vossen [12, pp. 303—4], and Spivak [15, pp. 20-1 and p. 34]) and the 
heptahedron (see Hilbert and Cohn-Vossen [12, pp. 302-3] and Jones [13]). 
Indeed, we shall see that the envelope E also consists of seven pieces—it is like a 
heptahedron whose faces have been pinched to tangency along its edges—but each 
piece is now part of the zero set of its own polynomial in three variables. In 
contrast, the Roman surface is the zero set of a single polynomial in three 
variables that is of total degree four. 


3. THE ENVELOPE OF THE BISECTING LINES OF A TRIANGLE. Extending 
a homework problem in Thomas [18, p. 508, #61] or examples in Lamb [14, p. 232, 
Ex. 3], Greenhill [11, p. 190], or Bouasse [6, §253, p. 382], the following proposition 
summarizes the information about the envelope of the lines bisecting a triangle. 


Proposition. The envelope $E$ of the set of all lines that bisect a given triangle $T$ is a
simple continuous closed curve lying completely inside $T$. It consists of three parts, the
$i$th part being a segment of a hyperbola whose asymptotes include the two edges
containing the $i$th vertex. Each segment joins continuously with its neighbor to form a
cusp where the two are mutually tangent to an intervening median of $T$, and each of
$E$’s three cusps has the same order of sharpness as the graph of $y = |x|^{1/2}$ near $(0, 0)$.
The expression $b_0 V_0 + b_1 V_1 + b_2 V_2$ for the $i$th part of the envelope ($i = 0, 1, 2$), in
terms of barycentric coordinates $(b_0, b_1, b_2)$ relative to the vertices $V_0, V_1, V_2$ of $T$, is
independent of $T$ and is given by the usual requirement $b_0 + b_1 + b_2 = 1$ together
with the special requirements


$$8 \prod_{\substack{j=0 \\ j \ne i}}^{2} b_j = 1 \quad\text{and}\quad 1/4 < b_j < 1/2 \ \text{ for } j \ne i; \tag{3.1}$$


for which $1/4 < b_i < 1 - 1/\sqrt{2}$. See Figure 1.


Proof: We first recall an instance of an envelope’s hyperbolic segment. The rest 
then follows using properties of nonsingular affine transformations. 

The area of the right triangle $T(x)$, associated with the two coordinate axes and
the line tangent to the hyperbola $H := \{(t, 1/t),\ t > 0\}$, is 2, independent of the
point $(x, 1/x)$ ($x > 0$) of tangency. This is a consequence of the following argument.
$T(x)$ consists partly of the $x \times 1/x$ rectangle whose diagonal is the radius
vector to the point of tangency. And, as the magnitude of the tangent line’s slope is
$1/x^2$, the interior of $T(x)$ outside this rectangle consists of two $x \times 1/x$ right
triangles. The rectangle has area 1 and each right triangle has area 1/2, which gives
the total of 2.

Consider also, now, the isosceles right triangle $T_0$ of area 4 with vertices
$V_0 := (0, 0)$, $V_1 := (2\sqrt{2}, 0)$, and $V_2 := (0, 2\sqrt{2})$. The envelope of the hypotenuses of
those $T(x)$ that are completely inside $T_0$ is a portion of the envelope of the
bisecting lines for $T_0$. It is also a segment $S$ of $H$ inside $T_0$. The right-most point of
$S$ is the point $P_1 := (\sqrt{2}, 1/\sqrt{2})$ since the tangent line to $H$ here is also the median
of $T_0$ that passes through $(0, \sqrt{2})$ and the vertex $V_1$. By symmetry, $S$’s left-most
point is $P_2 := (1/\sqrt{2}, \sqrt{2})$; and the tangent to $H$ there is also the median of $T_0$
through $V_2$. Since the curvature of $H$ exists on $H$ but vanishes nowhere, $S$ has the


Figure 1. The bisecting envelope of a triangle: a line tangent to the cusped figure separates a corner 
from its opposite side and divides the triangle into two equal areas. The envelope’s vertices are 
associated with the triangle’s. Note that the normal line changes continuously along the whole of the 
envelope. 
type of contact with these two tangent medians that is characteristic of a circle’s
contact with its tangent—this will be relevant to the cusp’s order of sharpness. 

Two of the barycentric coordinates of $(x, 1/x)$ with respect to the three vertices
$V_0$, $V_1$, and $V_2$ of $T_0$ are clearly $b_1 = x/(2\sqrt{2})$ and $b_2 = 1/(2x\sqrt{2})$. Consequently,
on the segment $S$, $b_1$ and $b_2$ satisfy (3.1) with $i = 0$.
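
The two facts used so far (the tangent triangle always has area 2, and the barycentric coordinates of points of the segment S satisfy (3.1) with i = 0) can be checked numerically. The following is only a sketch; the function names are ours, and the sample points are an arbitrary choice between the x-coordinates of the two endpoints of S.

```python
import math

def tangent_triangle_area(x):
    """Area cut from the first quadrant by the tangent to y = 1/t at t = x."""
    slope = -1.0 / (x * x)           # derivative of 1/t at t = x
    y0 = 1.0 / x
    x_intercept = x - y0 / slope     # where the tangent meets the x-axis
    y_intercept = y0 - slope * x     # where the tangent meets the y-axis
    return 0.5 * x_intercept * y_intercept

def barycentric_in_T0(x):
    """Barycentric coordinates (b0, b1, b2) of (x, 1/x) relative to T_0."""
    leg = 2.0 * math.sqrt(2.0)       # leg length of the isosceles right triangle T_0
    b1 = x / leg
    b2 = (1.0 / x) / leg
    return 1.0 - b1 - b2, b1, b2

def check_segment(samples=9):
    """Sample S between its end points and collect the quantities of interest."""
    lo, hi = 1.0 / math.sqrt(2.0), math.sqrt(2.0)
    rows = []
    for k in range(samples):
        x = lo + (hi - lo) * k / (samples - 1)
        b0, b1, b2 = barycentric_in_T0(x)
        rows.append((x, tangent_triangle_area(x), 8.0 * b1 * b2, b0))
    return rows

if __name__ == "__main__":
    for x, area, product, b0 in check_segment():
        print(f"x={x:.4f}  area={area:.6f}  8*b1*b2={product:.6f}  b0={b0:.6f}")
```

Each printed row should show an area of 2 and a product of 1 up to rounding, with b0 staying within the range stated in the Proposition.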

Now: a nonsingular affine map A maps tangent curves onto tangent curves and 
envelopes of curves onto envelopes of their images. Such a map A also maps lines 
onto lines, hyperbolas onto hyperbolas, non-degenerate triangles onto non-degen- 
erate triangles and their medians onto medians, and (because A’s Jacobian 
determinant is independent of location) a line that bisects a triangle onto a line 
that bisects its image. Moreover, since a nonsingular affine map A is the sum of a 
linear map and a constant vector, the barycentric coordinates of a point $V$ with
respect to three points $V_0$, $V_1$, and $V_2$ in general position (non-degenerate convex
hull) are also the barycentric coordinates of $A(V)$ with respect to $A(V_0)$, $A(V_1)$,
$A(V_2)$ (which are also in general position).

With this, proof of the remainder of the Proposition goes as follows. (a) Use
three separate affine maps of $T_0$ onto an equilateral triangle $T_1$ (along with the
symmetries of $T_1$) to show that the complete envelope for $T_1$ satisfies the
Proposition (in particular, that neighboring hyperbolic segments are tangent at a
common point on a median and so form the cusp as there characterized). (b) Then
use a single affine map taking $T_1$ onto $T$.


4. PART OF THE ENVELOPE OF THE BISECTING HYPERPLANES OF A
SIMPLEX IN n-DIMENSIONS. For $n > 2$ dimensions, consider the surface
$H := \{x = (x_1, \dots, x_n) > 0 \text{ such that } f(x) = 0 \text{ with } f(x) = \prod_{j=1}^{n} x_j - 1\}$ (for three
dimensions, see, e.g., Appell [1, p. 231, problem 11], Greenhill [11, p. 202], Struik
[16, p. 73, problem 6], Courant and John [8, problem 8, p. 307]). Note that the $j$th
component of the gradient $\nabla f$ of $f$ on $H$ is $(\nabla f)_j = 1/x_j$; hence points $X$ in the
hyperplane tangent to $H$ at $x$ satisfy $X \cdot \nabla f = n$. Thus the altitudes of the simplex
$S(x)$ bounded by the coordinate hyperplanes and the hyperplane tangent to $H$ at
$x$ are $(n x_j)_{j=1}^{n}$; so the volume of $S(x)$ is $n^n/n!$ independent of $x$. Restricting $x$ so
that $S(x) \subset S_0 :=$ the simplex of volume $2n^n/n!$ bounded by the coordinate
hyperplanes and the hyperplane through and normal to $(1, \dots, 1)$ (that is, restricting
the vertices of $S(x)$ to lie between those of $S_0$ and the origin), one can prove
(see the Appendix) that an $n$-dimensional analog of the Proposition in §3 is:


Proposition. Let $n > 2$. The envelope of the hyperplanes bisecting an $n$-dimensional
simplex with vertices $V_0, \dots, V_n$ consists partly of $n + 1$ hypersurfaces, each of degree
$n$. The asymptotic hyperplanes of the $i$th of these hypersurfaces (some $0 \le i \le n$) are
the hyperplane extensions of those $n$ faces of the simplex that contain the $i$th vertex $V_i$.


This $i$th hypersurface’s $n + 1$ barycentric coordinates $b_0, \dots, b_n$ with respect to
$V_0, \dots, V_n$ satisfy the usual requirement $\sum_{j=0}^{n} b_j = 1$, together with the relations

$$2n^n \prod_{\substack{j=0 \\ j \ne i}}^{n} b_j = 1 \quad\text{and}\quad 1/(2n) < b_j < 1/n \ \text{ for } j \ne i; \tag{4.1}$$


for which $1/(2n) < b_i < 1 - 2^{-1/n}$. For $n = 3$ dimensions, see Figure 2.
The $n$-dimensional simplex has $n + 1$ vertices. The hyperplanes whose envelope


is given by (4.1) separate the $i$th vertex from the $n$ remaining vertices. Thus there
are $n + 1$ such portions of the complete envelope of the simplex’s bisecting
Figure 2. The corresponding bisecting envelope of a tetrahedron: a plane tangent to any of these four
cupped, tri-edged surfaces separates a vertex from its opposite face and bisects the tetrahedron. It is
now three edges of the envelope (the nearest coplanar hyperbolas) that are associated with the
tetrahedron’s nearest vertex. There must be more bisecting planes.


hyperplanes. But there are other parts of the envelope. To obtain the remaining 
portions, the following outline of an algorithm can be carried out. 

Let the $n + 1$ vertices be divided into two nonempty sets of vertices. Call the
total number of such divisions $I(n)$ ($= (2^{n+1} - 2)/2$). For each such division, it is
necessary to find the corresponding portion of the complete envelope. The
complete envelope will then consist of $I(n)$ portions. In the next section we discuss
the case $n = 3$; it is the only case for which we have a complete discussion.
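
The count I(n) can be confirmed by brute-force enumeration. The sketch below is our own illustration, not part of the algorithm outlined above; it labels the vertices 0, ..., n and lists each unordered division once by always recording the part that contains vertex 0.

```python
from itertools import combinations

def vertex_divisions(n):
    """All divisions of the n + 1 vertices {0, ..., n} into two nonempty sets."""
    vertices = range(n + 1)
    divisions = []
    # The recorded part may have 1 to n elements, so its complement is never empty.
    for size in range(1, n + 1):
        for part in combinations(vertices, size):
            # Keep only parts containing vertex 0, so each unordered split appears once.
            if 0 in part:
                rest = tuple(v for v in vertices if v not in part)
                divisions.append((part, rest))
    return divisions

if __name__ == "__main__":
    for part, rest in vertex_divisions(3):
        print(part, "|", rest)
```

For the tetrahedron (n = 3) the list has seven entries: four that split one vertex from the opposite face and three that split the vertices into two pairs, in agreement with the cups and saddles discussed in the next section.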


5. THE ENVELOPE OF THE BISECTING PLANES OF A TETRAHEDRON. A 
little thought concerning the equilateral tetrahedron (n = 3) in this context sug- 
gests that, corresponding to each non-intersecting pair of its edges, there should be 
a saddle-shaped surface bounded by four curves. One of these curves bounding a 
given saddle coincides with one of the three curves bounding a “cup” of Figure 2;
each saddle is thereby connected to each of the four cups. The three saddles
intersect (at the center of symmetry, for example) but are not tangent to each
other, while they are tangent to the cups. Figures 3-4 illustrate these additional
steps in the construction of the entire envelope of the bisecting planes of a regular 
tetrahedron. The captions explain the figures. 

Further analysis is required to precisely specify these surfaces. As already
noted, the affine invariance of the problem means that it suffices to determine the
barycentric coordinates of the complete envelope $E$ of the bisecting planes for any
particular tetrahedron, and we shall fix on the tetrahedron $U$ whose vertices are
the origin $V_0 := 0$ and the three coordinate unit vectors $V_1 := i$, $V_2 := j$, and
$V_3 := k$. In other words, $U = \overline{0ijk}$, where the overbar means “convex hull of.” In
this context, if a portion of this envelope is a surface


$$x = x(u, v), \quad y = y(u, v), \quad z = z(u, v);$$


1993] BISECTORS OF TRIANGLES 629 



(3c) (3d) 


Figure 3. Planes tangent to one of three saddle-shaped surfaces bisect the tetrahedron and separate 
opposing edges. The saddles in each pair are tangent at the ends of the line-segment along which they 


intersect; but at the segment’s midpoint (the tetrahedron’s centroid) they are orthogonal—this last, for 
the regular tetrahedron illustrated. 


then its associated barycentric coordinates (with respect to $V_0, \dots, V_3$) will be
$$b_1 = x, \quad b_2 = y, \quad b_3 = z, \quad\text{and}\quad b_0 = 1 - (b_1 + b_2 + b_3).$$


For example: according to (4.1), that portion of $E$ consisting of the envelope of the
bisecting planes that separate the face $\overline{ijk}$ from the origin is the surface


$$xyz - 1/54 = 0 \quad\text{with}\quad 1/6 < x, y, z < 1/3; \tag{5.1}$$


and three other portions of $E$ are found by replacing $x = b_1$, $y = b_2$, and $z = b_3$
here with the three other groups of the four things $\{b_0, \dots, b_3\}$ taken three at a
time.

So it remains to obtain a more analytic description of, say, the envelope of the
bisecting planes that separate the edge $\overline{0k}$ from the edge $\overline{ij}$; as the remaining two
portions of $E$ can then be found by substituting each of the two remaining groups
of variables associated with each of the other two pairs of edges.


For this we recall that the envelope of a two-parameter family of surfaces


$$g(x, y, z; p, q) = 0$$


can be constructed by requiring that at the same time $g_p$ ($:= \partial g/\partial p$) $= 0$ and
$g_q = 0$; thereby determining the envelope in the form (say) of $x = x(p, q)$, y =
Figure 4. The three saddles fit inside the four cups, thus completing the bisecting envelope of a
tetrahedron. Each cup is a cubic. Although each saddle is algebraic, its degree is unknown. If the 
saddles are regarded as not intersecting, then the complete envelope (4c) has the topology of Steiner’s 
Roman surface—i.e., of Hilbert and Cohn-Vossen’s heptahedron. And if part of being an envelope is 
that the normal line change continuously, then the saddles should be regarded as not intersecting 
except at their corners. 


y(p,q), and z = z(p,q). We shall do this in the slightly different situation of the 
three-parameter family of surfaces 


$$G(x, y, z; p, q, r) := xp + yq + zr - 1 = 0 \tag{5.2}$$
constrained by a known (albeit yet to be described) function
$$F(p, q, r) = 0 \quad\text{determining}\quad r = r(p, q). \tag{5.3}$$


Here one may verify, with a solution of this last in hand, that the functions
$x(p, q)$, $y(p, q)$ and $z(p, q)$ given by


$$x = F_p/(pF_p + qF_q + rF_r), \tag{5.4a}$$
$$y = F_q/(pF_p + qF_q + rF_r), \quad\text{and} \tag{5.4b}$$
$$z = F_r/(pF_p + qF_q + rF_r), \tag{5.4c}$$
indeed solve $0 = g = g_p = g_q$ if we define
$$g(x, y, z; p, q) := G(x, y, z; p, q, r(p, q))$$


as specified in (5.2). The relations (5.4) come about as follows: Eliminating $r_p$
between the $p$-derivative of (5.2) and that of (5.3) yields


$$x = zF_p/F_r; \quad\text{and so, similarly,}\quad y = zF_q/F_r. \tag{5.5}$$


Substituting (5.5) into (5.2) yields the third relation in (5.4); the first two then 
follow from (5.5). Favard [9, p. 186] develops relations equivalent to (5.4) in the 
context of (5.2)-(5.3), albeit under the assumption that F is homogeneous in 
variables related to those in (5.2). 

It remains to construct a function $F$ so that (5.3) implies that the envelope of
(5.2) is the envelope of the bisecting planes separating $\overline{0k}$ from $\overline{ij}$. For this it is
more convenient to consider the three coordinate-axis intercepts


$$u = 1/p, \quad v = 1/q, \quad w = 1/r \tag{5.6}$$


of the plane (5.2) regarded, for each fixed $u$, $v$, and $w$, as a surface $P$ in
$xyz$-space. This plane $P$ is to separate the edges mentioned, so we take


$$0 < u, v < 1, \quad\text{and}\quad 1 < w < \infty.$$


The volume of the original tetrahedron $U$ is 1/6, while the volume of the
tetrahedron $T$ bounded by $P$ and the three coordinate planes is $uvw/6$. The
relationship $F = 0$ (5.3) is to express the bisection requirement that the amount of
$T$ inside $U$ be 1/12. This quantity, $\mathrm{vol}(T \cap U)$, is computed by finding the
intersection of $P$ with the edge $\overline{ik}$ and with the edge $\overline{jk}$, and subtracting from $T$’s
volume the volume of the tetrahedron inside $T$ but outside $U$ using a sixth of the
appropriate scalar triple product. Applying (5.6), the associated $F$ we used in (5.3)
was


$$F = r^2 + [p + q - 6 + 2/(pq)]\,r + 6 - pq - 2(p + q)/(pq), \tag{5.7}$$
$$1 < p, q < \infty, \tag{5.8}$$

and such that
$$0 < r < 1. \tag{5.9}$$

That $F$ here, for $p$ and $q$ given, is quadratic in $r$ facilitates the determination of
$r(p, q)$ in (5.3) and the associated surface (5.4). Condition (5.9) is actually a
condition on $p$ and $q$. This actually determines only half of the barycentric
coordinates of this sheet of the complete envelope $E$. The remaining half of the
sheet is given by interchanging $b_3$ ($= z$) and $b_0$ (this is most easily seen by
considering the regular tetrahedron instead of $U$, a context in which the saddle-shaped
character of this sheet is also most apparent). Finally, two other sheets are
similarly associated with the other two pairs of edges.
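
To make the recipe (5.2)-(5.9) concrete, here is a minimal numerical sketch. It assumes the reading of (5.7) as F = r^2 + [p + q - 6 + 2/(pq)] r + 6 - pq - 2(p + q)/(pq); the partial derivatives are worked out by hand from that expression, and the closed form used for vol(T ∩ U) in `volume_inside_U` is our own evaluation of the scalar triple product described above, not a formula quoted from the text. All function names are hypothetical.

```python
import math

def F(p, q, r):
    """Bisection constraint (5.7), as read from the text."""
    return (r * r + (p + q - 6.0 + 2.0 / (p * q)) * r
            + 6.0 - p * q - 2.0 * (p + q) / (p * q))

def grad_F(p, q, r):
    """Hand-computed partial derivatives (F_p, F_q, F_r) of the expression above."""
    Fp = r * (1.0 - 2.0 / (p * p * q)) - q + 2.0 / (p * p)
    Fq = r * (1.0 - 2.0 / (p * q * q)) - p + 2.0 / (q * q)
    Fr = 2.0 * r + p + q - 6.0 + 2.0 / (p * q)
    return Fp, Fq, Fr

def solve_r(p, q):
    """Root r of the quadratic F(p, q, r) = 0 with 0 < r < 1, or None (condition (5.9))."""
    b = p + q - 6.0 + 2.0 / (p * q)
    c = 6.0 - p * q - 2.0 * (p + q) / (p * q)
    disc = b * b - 4.0 * c
    if disc < 0.0:
        return None
    for root in ((-b - math.sqrt(disc)) / 2.0, (-b + math.sqrt(disc)) / 2.0):
        if 0.0 < root < 1.0:
            return root
    return None

def envelope_point(p, q):
    """Point (x, y, z) of the saddle given by (5.4), together with r(p, q)."""
    r = solve_r(p, q)
    if r is None:
        return None
    Fp, Fq, Fr = grad_F(p, q, r)
    denom = p * Fp + q * Fq + r * Fr
    return Fp / denom, Fq / denom, Fr / denom, r

def volume_inside_U(p, q, r):
    """Volume of T inside U for the plane with intercepts 1/p, 1/q, 1/r (our derivation)."""
    whole = 1.0 / (6.0 * p * q * r)                              # uvw/6
    outside = (1.0 - r) ** 3 / (6.0 * r * (p - r) * (q - r))     # part of T above x + y + z = 1
    return whole - outside

if __name__ == "__main__":
    p, q = 1.5, 1.5
    x, y, z, r = envelope_point(p, q)
    print("r =", r, " point =", (x, y, z))
    print("plane residual:", x * p + y * q + z * r - 1.0)
    print("vol(T in U):", volume_inside_U(p, q, r), " target:", 1.0 / 12.0)
```

Not every pair (p, q) in (5.8) yields a root with 0 < r < 1, which is the sense in which (5.9) restricts p and q; `solve_r` returns `None` in that case.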


6. THE DEGREES OF THE POLYNOMIALS THAT DESCRIBE THE ENVE- 
LOPE. To review: the envelope of the bisecting planes of the tetrahedron has 
seven parts: three saddle-shaped surfaces and four cup-shaped surfaces. Each of 
the three saddle-shaped surfaces is associated with separating two edges of the 
tetrahedron. Each of the four cup-shaped surfaces similarly divides a vertex from a 
face. We have given the polynomials for the cup-shaped surfaces in (5.1) and its 
following three lines—they are of total degree three. 

We now discuss the polynomials that define the three saddle-shaped surfaces. 
These polynomials seem to be complicated and we have not been able to complete 
this project. The modus operandi to find one of these polynomials is to begin with
four algebraic equations from §5 in the variables x, y, and z and the parameters 
p, q, and r. Then one uses resultant theory (see Uspensky [19, Chapter XII]) to 
eliminate the parameters one by one and terminate with one algebraic equation in 
the variables x, y, and z. This elimination process was attempted on an 8650 VAX 
computer using the MACSYMA symbolic manipulation system—but it did not 
complete the final elimination in many days of standby computer time. 
More specifically, using (5.3) and (5.4), we obtain:


$$(pF_p + qF_q + rF_r)x - F_p = 0, \tag{6.1a}$$
$$(pF_p + qF_q + rF_r)y - F_q = 0, \tag{6.1b}$$
$$(pF_p + qF_q + rF_r)z - F_r = 0, \tag{6.1c}$$

$$F = 0. \tag{6.1d}$$


We then substitute into the equations (6.1) the expression for $F$ given by (5.7) and
clear fractions to obtain a set of four polynomial equations in the variables $x$, $y$,
and $z$ and parameters $p$, $q$, and $r$. The parameters are subject to the conditions in
(5.8) and (5.9). Again, the condition (5.9) is actually a restriction on $p$ and $q$. The
conditions on $p$ and $q$ have no relevance to the resultant algorithm, which treats
the parameters to be eliminated as formal symbols. The restrictions (5.8) and (5.9)
arise geometrically. However, they are also algebraic restrictions. We know, for
example, that if $r = p = q = 1$, then $F = F_p = F_q = F_r = 0$ and the four equations
(6.1a-d) are satisfied regardless of the values of $x$, $y$, $z$. Thus all points in $\mathbb{R}^3$
would be on the surface defined by (6.1) for the values $r = p = q = 1$.

To eliminate $p$, $q$, and $r$ from (6.1) requires three applications of the resultant
calculation. We had enough machine time to do two of the three. We outline these
two using the tables below.

The left side of each equation in (6.1) is multiplied by its denominators’ least
common multiple, yielding polynomials $f^{(1)}(x, p, q, r)$, $f^{(2)}(y, p, q, r)$,
$f^{(3)}(z, p, q, r)$, and $f^{(4)}(p, q, r)$. Their degrees and number of terms are displayed
in Table 1. This is the first set of equations to which we apply the resultant
operation.


TABLE 1
Equation                      Degree   # of terms
$f^{(1)}(x, p, q, r) = 0$     6        12
$f^{(2)}(y, p, q, r) = 0$     6        12
$f^{(3)}(z, p, q, r) = 0$     6        13
$f^{(4)}(p, q, r) = 0$        4        9


Eliminating $r$ from the appropriate pairs of equations in Table 1 yielded
Table 2.


TABLE 2
Equation                   Degree   # of terms
$g^{(1)}(x, p, q) = 0$     9        21
$g^{(2)}(y, p, q) = 0$     9        21
$g^{(3)}(z, p, q) = 0$     10       39
Eliminating $q$ from the appropriate pairs of equations in Table 2 yielded
Table 3.


TABLE 3
Equation                   Degree   # of terms
$h^{(1)}(x, y, p) = 0$     32       ≈ 500
$h^{(2)}(x, z, p) = 0$     27       ≈ 300


Assuming little or no cancellation in the elimination of p from the two 
equations in Table 3 to obtain a single polynomial equation P(x, y, z) = 0, we 
estimate the degree of P to be 150—and if P has no factors, this seems a 
surprisingly large degree to be associated with such a simple problem. Unfortu- 
nately, the MACSYMA computation attempting this elimination was terminated 
(without output) after many days of calculation. 

On the other hand, C. de Boor felt he could demonstrate a lower bound on the
saddle’s degree [4]. For this he took $N = 45$ points on the horizontal saddle
associated with the tetrahedron having as vertices the points $(\pm(3, 3), 3)$ and
$(\pm(3, -3), -3)$; the resulting saddle has its corners at $\pm i$ and $\pm j$. He applied a
numerical algorithm, based on [5], to construct a space $\mathcal{F}$ of polynomials in three
variables of smallest possible degree that allowed unique interpolation to arbitrary
values at the $N$ points and at the additional point $(1, -1, 1)$. Although $\mathcal{F}$ turned
out to contain all quartics, he found that nontrivial members of $\mathcal{F}$ vanishing at the
first $N$ points had degree higher than four. This convinced us that the surface
defined by (6.1) and (5.7) is of degree higher than four.


7. GRAPHS OF THE ENVELOPE OF THE BISECTING PLANES OF A TETRA- 
HEDRON. For those who care about graphics as well as graphs: The principal 
software invoked in the computer construction of Figures 2-4 was the PLTN2
“super-package,” developed by J. M. Hyman and R. Dougherty to ease (interactively)
the application of both M. Prueitt’s GRAFIC package (used here) and the
NCAR (National Center for Atmospheric Research) package. All people men- 
tioned in this connection were at the Los Alamos National Laboratory. GRAFIC 
plots a sequence of surfaces in three dimensions, each surface being prescribed by 
a “logically rectangular” set of points (X, pe =1 lying in the surface. The surface 
is then approximated as follows. The smallest logical sub-squares are each edged 
by line segments; and (for the purpose of computing normals) the surface spanning 
these four segments is considered to be the two-parameter bilinear average
interpolating the four vertices (which is one of the doubly ruled surfaces containing 
these vertices). GRAFIC removes points and line segments that are hidden from 
the viewer by other surfaces, and both shading and color are options. Indeed, the 
most illuminating version of these figures includes both. 

For the figures, the edges of the tetrahedron and the axes through its centroid 
are specified to be slender tubes (with polygonal sections), not lines. The require- 
ment of logically rectangular data meant that each cup-shaped segment of the 
envelope is graphed as the union of three four-edged sections. Along the two 
(adjoining) outer edges of one of these sections, the mesh is relatively uniform 
(each such edge is also an edge of one of the saddles). But along the other two 
edges (where these sections join each other) the mesh diminishes like r*/* towards 
the center of the cup-shaped segment. This was done to improve the visual
smoothness (associated with the fact that each “cup” is, in fact, analytic), and 
roughly approximates C. de Boor’s suggestion to use a coordinate system associ- 
ated with a conformal map of a 90° angle onto a 120° angle for this purpose. 


8. DIVIDING TRIANGLES INTO REGIONS OF UNEQUAL SIZE. The original 
problem in computational hydrodynamics requires information about the lines (or 
planes) that divide a region into two subsets of prescribed (and not necessarily 
equal) relative size. 

Towards this end we consider for given $\theta$, $0 < \theta < 1/2$, the envelope $E_\theta$ of the
lines that separate a triangle $T$ into polygons of relative area $\theta$ and $1 - \theta$. As in
§3, this problem and its solution are invariant under nonsingular affine maps. So,
as in §3, it is relevant that there are two equilateral hyperbolas of the form
$xy = \text{constant}$ such that some of the tangent lines of each cut the right triangle $\overline{0ij}$
into two such regions. More specifically, these hyperbolas satisfy


$$\text{either}\quad 4xy = \theta \quad\text{or}\quad 4xy = 1 - \theta,$$


depending on whether the $\theta$-fraction lies on the 0-side of the tangent line or on its
other side. It follows that: for given $\theta$, $0 < \theta < 1/2$, the envelope $E_\theta$ of the lines
that divide a triangle $T$ into portions of relative size $\theta$ and $1 - \theta$ consists of three
pairs of hyperbolic segments. One pair of the three pairs is associated with each vertex
$V_i$ of $T$ by having as asymptotes the two edges of $T$ that contain $V_i$; and the


barycentric coordinates of this pair satisfy (with subscripts taken mod 3)
$b_{i-1} + b_i + b_{i+1} = 1$, together with


$$\text{(a)}\ \ 4b_{i-1}b_{i+1} = \theta \quad\text{or}\quad \text{(b)}\ \ 4b_{i-1}b_{i+1} = 1 - \theta. \tag{8.1a}$$


At the ends of each hyperbolic segment (say of type (a) for some $i = i_0$) one
switches to another of the other type (i.e., type (b) for some $i \ne i_0$). Hence, the
point $(b_{i-1}, b_i, b_{i+1})$ at the $V_{i+1}$-end of the type (a)-segment, being also at the
$V_{i+1}$-end of a type (b)-segment, satisfies


$$4b_{i-1}b_{i+1} = \theta, \quad 4b_i b_{i+1} = 1 - \theta, \quad\text{and}\quad b_{i-1} + b_i + b_{i+1} = 1. \tag{8.1b}$$
Consequently: The endpoints of the hyperbolic segments (8.1a) or (8.1b) also satisfy
$$b_{i+1} = 1/2 \quad (\text{so that, also, } b_{i-1} + b_i = 1/2 \text{ there}). \tag{8.1c}$$


Checking that, indeed, the two types of hyperbolic segments are tangent at such
common points, we see that these cusps (where the interlaced hyperbolic segments
now join together with continuously turning tangent line) trace (for $0 < \theta < 1/2$)
the three open line-segments whose closures connect the three midpoints of the
edges of $T$. And the cusps perform this covering in one-to-one fashion.
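
The step from (8.1b) to (8.1c) can be checked in exact arithmetic. The sketch below rests on our reading of (8.1b) as the three relations 4 b_{i-1} b_{i+1} = θ, 4 b_i b_{i+1} = 1 - θ, and b_{i-1} + b_i + b_{i+1} = 1: adding the first two and using the third gives 4 b_{i+1}(1 - b_{i+1}) = 1, hence b_{i+1} = 1/2. The function name `cusp` is ours, and it covers only the end point where a type (a) segment hands over to a type (b) segment.

```python
from fractions import Fraction

def cusp(theta):
    """Barycentric coordinates (b_{i-1}, b_i, b_{i+1}) at the V_{i+1}-end
    of a type (a) segment, for a given theta with 0 < theta < 1/2."""
    theta = Fraction(theta)
    b_next = Fraction(1, 2)          # (8.1c)
    b_prev = theta / (4 * b_next)    # from 4 b_{i-1} b_{i+1} = theta
    b_mid = 1 - b_prev - b_next      # barycentric coordinates sum to 1
    return b_prev, b_mid, b_next

if __name__ == "__main__":
    for theta in (Fraction(1, 20), Fraction(1, 4), Fraction(4, 9)):
        b_prev, b_mid, b_next = cusp(theta)
        # The second relation of (8.1b) should then hold automatically.
        print(theta, (b_prev, b_mid, b_next), 4 * b_mid * b_next == 1 - theta)
```

As θ runs over (0, 1/2) this point moves along the line b_{i+1} = 1/2 joining two edge midpoints, which is consistent with the description of where the cusps lie.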

This is all illustrated in Figure 5. There it is seen, as $\theta$ moves from 1/2 to 0,
that the original three-cusped envelope $E_{1/2}$ (Figure 1) doubles its length for $\theta$
just below 1/2, becomes a trefoil containing $T$’s centroid for $\theta = 4/9$; and that $E_\theta$
approaches the boundary of the original triangle $T$ as $\theta$ approaches zero. Except
for $\theta = 1/2$ or 4/9, the hyperbolas composing $E_\theta$ cross (transversely) thrice (each
time on one of $T$’s medians), so that it is seen that there is no $\theta$, $0 < \theta < 1/2$,
such that $E_\theta$ bounds only a convex figure.

Let us now illustrate how such envelopes could be used to locally approximate a
smooth boundary between a plane region $D$, colored white, and its black-colored
complement, when given as data only the average color $\int_T \chi_D\,dA / \int_T dA$ of each
triangle $T$ in a tessellation of the plane into small, equilateral triangles. ($\chi_D$, here,
(5a)


Figure 5. The envelopes of the lines that divide a triangle into two pieces having relative areas $\theta$ and
$1 - \theta$. Above: clockwise from the top, $\theta$ = 1/2, 0.485, 0.47, 4/9, 0.4, and 1/3. Below: from the left,
$\theta$ = 1/4, 0.15, and 0.05. As the text explains, all curves are segments of hyperbolas whose asymptotes
are the sides of the triangle, and all cusps lie on one of the lines joining the midpoints of the original
triangle’s sides.


(5b)
is the characteristic function of $D$, and “small” compares the diameter of $T$ to the
curvature of the interface). The average color of most triangles will be either black
(= 0) or white (= 1), but those through which the boundary $\partial D$ passes will be
colored intermediate values of gray. Since the triangles are small we could hope
that a locally linear approximation of $\partial D$ would be second-order accurate (i.e., have
an error that goes to zero like the area, not just the diameter, of each triangle).
And, indeed, this will be so if (a) the algorithm reproduces an arbitrary linear 
boundary exactly, and (b) the construction is both local (i.e., determined by nearby 
data and used only nearby) and stable. 

For example: Suppose, in Figure 5, that the triangle at 6 o’clock had average 
color 4/9 and the triangle at 10 o’clock had average color 2/3, and that we ignore 
the remaining triangles. Then there exist only a finite number of lines that divide 
each triangle into two pieces having areas of appropriate relative size—namely, 
the four common tangents to the indicated envelopes. But, only two of these four 
lines will do as borders of a half-space to approximate $\partial D$; since either of the other
two would have to be black on one side in order to color one triangle appropri- 
ately, but be white instead on that same side to simultaneously color the other 
triangle appropriately. Like the sign of a square root, the selection between the 
two remaining candidates for a linear boundary must be made using an additional 
criterion—for example, the location of some completely white triangle would 
usually suffice. 

Further details will be found in [17], including discussion of geometric circum- 
stances leading to second-order accuracy (and others, to lower-order accuracy in 
spite of reproducing linear boundaries). That three gray average colors can 
determine approximating planes in three dimensions is also noted there, along 
with connections of these problems with polynomial spline functions and with 
apparently nontrivial generalizations of the “ham-sandwich” problem of Steinhaus.


9. REMARKS. It is worth noting the connection of our envelopes with “surfaces 
of flotation” (using the terminology of hydrostatics and of naval architecture). 
Thus, suppose one is given a body $K$ in Euclidean $n$-space along with a prescribed
$\theta$ in (0,1). Let $E_\theta$ be the envelope of those (hyper)planes dividing $K$ in two parts
having relative volume $\theta$ and $1 - \theta$. (If the body $K$ had specific gravity $\theta$ (or
$1 - \theta$) and were floating in some orientation, then its sea-level (hyper)plane
section would be tangent to $E_\theta$ independent of that orientation.) In this regard,
then, White and John [20] claim to be the first to recognize (1871) that for two
dimensions the complete curve of flotation of an object can contain cusps;
indeed, our Figure 5 for $\theta = 1/4$ is qualitatively described there quite accurately
[20, p. 93]. In fact, the curves of flotation (for triangles of unspecified specific
gravity) in Figures 3, 4 and 5 of their Plate V (kindly sent us by the Secretary of
the Royal Institution of Naval Architects) are in harmony with the envelopes $E_\theta$ in
our Figure 5 for $\theta$ = 4/9, 1/4, and 1/3, respectively. Moreover, the curves of
flotation for actual vessels (see, e.g., [20, Figure 1 of Plate IV] or the reproduction
(from another paper by White) in Greenhill [11, p. 160]) have many
characteristics in common with the curves in our Figure 5.

In the expanded version of this paper (the report [2]) we included an attempt to
use a classic result concerning surfaces of flotation to help demonstrate that the
topology of bisecting hypersurfaces $E_{1/2}$ in $n$-space is that of the projective
hyperplane $P^{n-1}$, but that the topology of $E_\theta$ for $\theta \ne 1/2$ is, instead, that of the
surface $S^{n-1}$ of the unit ball. The attempt fails, but it is both amusing and
instructive.
It is possible for the envelope of the bisecting hyperplanes of a region to be a
single point; consider a rectangle or a circular disc. When it exists we have called
such a point a halfway point (all this in another report [3]). There we extend the
concept, i.e., the notion of the median of a distribution, as follows. Let $\rho$ be a
nonnegative function on $\mathbb{R}^n$ whose integral over $\mathbb{R}^n$ is finite. Then a point $h$ in $\mathbb{R}^n$ is
called the halfway point for $\rho$ if any $(n-1)$-dimensional hyperplane $H$ containing
$h$ has half the mass of $\rho$ on each side; i.e.,

$$\int_{H^+} \rho(x)\,dx = \int_{H^-} \rho(x)\,dx,$$

where $H^+$ and $H^-$ denote the two half spaces on either side of $H$. In [3] we
consider some characteristics of functions $\rho$ that have halfway points.
APPENDIX. FINISHING THE PROOF OF THE PROPOSITION IN §4. Let $e_j$ be
the $j$th coordinate unit vector in the canonical basis for $\mathbb{R}^n$. Note that a point
$\xi = \sum_{j=1}^{n} \xi_j e_j$ in $\mathbb{R}^n$ then has barycentric coordinates $(b_0, b_1, \ldots, b_n)$ relative to the
$n + 1$ vectors $e_1, \ldots, e_n$ and the origin $0 =: e_0$ that are given by $b_1 = \xi_1, \ldots, b_n = \xi_n$,
and $b_0 := 1 - \sum_{j=1}^{n} b_j$; for then $\xi = \sum_{j=0}^{n} b_j e_j$ with $\sum_{j=0}^{n} b_j = 1$. With this the
tangent hyperplane at a point $b = \sum_{j=1}^{n} b_j e_j$ in the hypersurface $G$ of points $b > 0$
satisfying $g(b) = \prod_{j=1}^{n} b_j - 1/(2n^n) = 0$ (see (4.1) with $i = 0$), namely the hyperplane
$H_b$ of points $\beta = \sum_{j=1}^{n} \beta_j e_j$ satisfying $\sum_{j=1}^{n} (\beta_j / b_j) = n$, defines (with the $n$
coordinate hyperplanes) a simplex $\Sigma_b$ whose volume is half that of the standard
simplex $\Sigma := \overline{e_0 e_1 \cdots e_n}$ (the overbar here means “convex hull of”). (To see that $g$
is indeed an appropriately scaled version of the function $f$ in the discussion above
the Proposition in §4, and thus that the equation in (4.1) is correct, note that the
simplex $S$ being bisected there is the convex hull of the origin $0 =: w_0$ and the $n$
vectors $w_j = n 2^{1/n} e_j$, $1 \le j \le n$; so that for $x$ there to satisfy both $x = \sum_{j=1}^{n} x_j e_j$
and its barycentric expression $x = \sum_{j=0}^{n} b_j w_j$ relative to $w_0, \ldots, w_n$ it suffices that
$g(b) = 0$ and $\sum_{j=0}^{n} b_j = 1$.)

The upper bounds $1/n$ in the inequalities $1/(2n) \le b_j \le 1/n$ in (4.1) are simply
the additional constraints on $b$ in $G$ appropriate for $H_b$ to bisect $\Sigma$; i.e., for $\Sigma_b$ to
be completely contained in $\Sigma$; i.e., for each of $H_b$'s $n$ coordinate-axis intercepts
$n b_j$ to lie between 0 and 1. The lower bounds $1/(2n)$ on all but $b_0$ come about as
follows: Fix $k$, $1 \le k \le n$. Then $b_k$ satisfying $g(b) = 0$ attains its minimum
relevant value $b_k = 1/(2n)$ when the $b_j$, $1 \le j \le n$ but $j \ne k$, all take on their
maximum relevant values $1/n$; and note, for this $b$ in $G$, that $b_0$ is also $1/(2n)$.

Finally, we now shall see that $b_0$ can be no smaller than $1/(2n)$ for all $b$ in the
“bisecting subset” $G_{1/2} \subset G$ (i.e., when $\Sigma_b \subset \Sigma$). Equivalently, we shall show that
$L(b) = \sum_{j=1}^{n} b_j$ is maximized on $G_{1/2}$ when $g(b) = 0$ (of course) and all but one of
$b_1, \ldots, b_n$ are $1/n$. First: as $\nabla L = \sum_{j=1}^{n} e_j$, $L$ has only one extreme value on the
hypersurface $G$, namely, when all $n$ of the $b_j$ are $1/(n 2^{1/n})$, and it is a minimum
(namely, $2^{-1/n}$). Consequently, maxima for $L$ over $G_{1/2}$ occur on its boundary.
But $b$ lies in the boundary of $G_{1/2}$ if and only if at least one $b_j$ is $1/n$, i.e., at least
one of the intercepts $n b_j$ of the corresponding bisecting hyperplane $H_b$ is 1. (For if
all $n b_j$ are strictly between 0 and 1, and also restricted by $g(b) = 0$ of
course, then $b$ is in the interior of $G_{1/2}$ in that the intercepts of $H_b$ then have
$n - 1$ independent degrees of freedom locally.) So, suppose $b_n = 1/n$. Then we
wish to maximize $L_{n-1}(b_1, \ldots, b_{n-1}) = \sum_{j=1}^{n-1} b_j$ subject to
$g_{n-1}(b_1, \ldots, b_{n-1}) = \prod_{j=1}^{n-1} b_j - 1/(2n^{n-1}) = 0$. The one interior extremum is again a minimum (with all
$n - 1$ variables now $1/(n 2^{1/(n-1)})$); the boundary containing the maxima again
consists of vectors with at least one more intercept of $H_b$ being 1, i.e., with one
more coordinate, say $b_{n-1}$, being fixed at $1/n$. And so forth, down to the point
when $L_2(b_1, b_2) = b_1 + b_2$ is to be maximized subject to $g_2(b_1, b_2) = b_1 b_2 - 1/(2n^2) = 0$.
Hence (say) $b_2 = 1/n$ and $b_1 = 1/(2n)$. And we have shown what
we desired, namely, that the maxima of $L$ over $G_{1/2}$ (and hence the minima
$1/(2n)$ of $b_0$) occur with $b_j = 1/(2n)$ for some $1 \le j \le n$, and all the rest at $1/n$.
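
The conclusion can be spot-checked for a small case. The sketch below is only a numerical illustration, not part of the proof: it assumes $n = 3$, walks a grid (the function name and the grid size are illustrative choices) over pairs $(b_1, b_2)$ in $[1/(2n), 1/n]$, lets the constraint $b_1 b_2 b_3 = 1/(2n^n)$ determine $b_3$, and records the smallest $b_0 = 1 - (b_1 + b_2 + b_3)$ found inside the bisecting subset.

```python
from fractions import Fraction as F

N = 3  # dimension assumed for this illustration


def min_b0_on_grid(steps=12):
    """Smallest b_0 found on a grid over the bisecting subset, for n = 3."""
    lo, hi = F(1, 2 * N), F(1, N)
    target = F(1, 2 * N**N)          # required product b_1 * b_2 * b_3
    grid = [lo + (hi - lo) * F(k, steps) for k in range(steps + 1)]
    best = None
    for b1 in grid:
        for b2 in grid:
            b3 = target / (b1 * b2)  # forced by g(b) = 0
            if lo <= b3 <= hi:       # keep only points of the bisecting subset
                b0 = 1 - (b1 + b2 + b3)
                if best is None or b0 < best[0]:
                    best = (b0, (b1, b2, b3))
    return best


b0, point = min_b0_on_grid()
print(b0, point)
```

On this grid the minimum is attained with one coordinate at $1/(2n)$ and the others at $1/n$, in line with the argument above; exact rational arithmetic avoids any rounding question at the boundary.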

Geometrically, these extrema occur when the hyperplane $H_b$ that bisects $\Sigma$ also
contains one of the $(n - 2)$-dimensional “edges” of the “face” $\overline{e_1 e_2 \cdots e_n}$ of $\Sigma$ that
$H_b$ is separating from the vertex $e_0$; each of these $n$ “edges” consists of the
convex hull of all but one of the basis vectors $e_1, \ldots, e_n$.

The minimization of b, above is a simple example of solving a problem in 
geometric programming, that is, finding the extreme values of a generalized 
polynomial in n variables subject to generalized polynomial constraints. A general- 
ized polynomial, here, is a linear combination of products of (not necessarily 
integral) powers of the variables. 


Added in Proof. Our text associates the idea of the heptahedron with the names of
Hilbert and Cohn-Vossen. However, François Apéry’s recent and handsome book,
Models of the Real Projective Plane (Friedr. Vieweg & Sohn, Braunschweig, 1987),
calls it (p. 17) the Reinhardt heptahedron. An appropriate reference is: Curt
Reinhardt, Zu Möbius’ Polyedertheorie, Berichte über die Verhandlungen der
Königlichen Sächsischen Gesellschaft der Wissenschaften zu Leipzig,
Mathematisch-physikalische Classe, vol. 37, 1885, pp. 106-125. Reinhardt also deposited a
cardboard model in the Mathematical Institute of the University at Leipzig. We do
not know if it survived.

Postscript. We’ve just been pointed (by L. M. Kelly) to G. Gunther and J. B.
Wilker’s paper The bisectrix of a tetrahedron, Mathematika 39 (1992), 93-103.
Although figures and any historical perspective are lacking there, we do want to
note that it contains a number of similar ideas.




The name of Professor FELIX KLEIN, of the University of Göttingen, together
with those of six other German educators, has been cancelled from the roll of
honorary members of the National Education Association in response to a
persistent demand from active members of the association, from members of the
Council of National Defense, and from others.

American Mathematical Monthly 25 (1918), p. 331
More on Rectangles Tiled by Rectangles


D. G. Mead and S. K. Stein 


The first theorem on a rectangle tiled by rectangles was proved by Dehn in 1903 [3]:


Theorem 1. Let R be a rectangle that has at least one edge of rational length. Let R 
be tiled by smaller rectangles each of which has the property that the ratio of its length 
to its width is rational. Then all the edges of R and of the tiling rectangles have 
rational lengths. 


In 1940 Brooks, et al. [2] obtained this result by associating an electrical network 
consisting of currents, voltages, and resistances with the tiling and using well 
known properties of such networks. We will modify their approach slightly by 
adding a battery to each edge to obtain the following theorems. 


Theorem 2. Let R be a rectangle that has at least one edge of rational length. Let R 
be tiled by smaller rectangles each of which has a rational perimeter. Then all the 
edges of R and of the tiling rectangles have rational lengths. 


If the assumption on $R$ in Theorem 2 is replaced by “$R$ has a rational
perimeter,” the result is false, as is shown by tiling the $2\sqrt{2}$ by $4 - 2\sqrt{2}$ rectangle
by four $\sqrt{2}$ by $2 - \sqrt{2}$ rectangles.


Theorem 3. Let R be a rectangle that has at least one edge of rational length. Let R 
be tiled by smaller rectangles whose length and width differ by a rational number. 
Then all the edges of R and of the tiling rectangles have rational lengths. 


Note that either Theorem 1 or Theorem 3 implies that in a tiling of a rectangle 
with a rational edge by squares, all the dimensions of the rectangles are rational. 


Theorem 4. Let R be a rectangle whose width and length differ by a rational number. 
Let R be tiled by smaller rectangles each of which has a rational perimeter. Then all 
the edges of R and of the tiling rectangles have rational lengths. 


1. THE METHOD. Let $G$ be a connected linear graph with $m$ vertices and $n$
edges, $e_1, e_2, \ldots, e_n$, such that each edge is incident to two distinct vertices. There
may be more than one edge incident to the same vertices. If edge $e_i$ is incident to
the vertices $A$ and $B$ we orient $e_i$ by selecting one of the orientations $AB$ or $BA$.
(We may think of the orientation $AB$ as an arrow from $A$ to $B$ and the algebraic
boundary of $e_i$ as $B - A$.)

Let $C_1$ consist of the formal sums $\sum_{i=1}^{n} x_i e_i$, where $x_i$ is real. Such a sum is
shorthand for a function $h$ from the set of edges to the real numbers, where
$h(e_i) = x_i$. $C_1$ is a vector space of dimension $n$ with real coefficients. If the vertices
are $V_1, V_2, \ldots, V_m$, let $C_0$ consist of the formal sums $\sum_{j=1}^{m} y_j V_j$ where the $y_j$ are
real. This sum stands for a function $p$ from the set of vertices to the real numbers,
where $p(V_j) = y_j$. Define $\partial\colon C_1 \to C_0$ by setting $\partial(e_i) = B_i - A_i$ if $e_i$ is oriented
from $A_i$ to $B_i$, and extending by linearity. Note that

$$p(\partial e_i) = p(B_i - A_i) = p(B_i) - p(A_i).$$


Let $r_1, r_2, \ldots, r_n$ be nonnegative real numbers associated with the edges
$e_1, e_2, \ldots, e_n$ respectively such that the set of edges associated with the $r_i$'s that are
0 contains no closed circuit. Let $w_1, w_2, \ldots, w_n$ be $n$ real numbers.

Consider the following two equations for the unknown functions $h \in C_1$ and
$p \in C_0$:

I. $\sum_{i=1}^{n} h(e_i)\,\partial e_i = 0$

II. $p(\partial e_i) = w_i - r_i h(e_i)$.

(In terms of electrical networks, $h(e_i)$ is the current in $e_i$, $r_i$ is the resistance in $e_i$,
$w_i$ is the electromotive force of a battery attached to $e_i$ and $p(V)$ is the potential
at $V$. Equation I asserts that the total current entering a vertex is 0. Equation II
relates the voltage drop over an edge to the current, resistance and the strength of
the battery at that edge.)

In ([1], pp. 162-171) it is shown that these simultaneous equations have a unique
solution for $h$ and a unique, up to an additive constant, solution for $p$. Actually, in
[1] it is assumed that all $r_i$ are positive. However, the key argument, which appears
in the footnote on p. 171, goes through with our weaker assumptions, as long as
the spanning tree used in the proof is chosen to contain the edges for which $r_i = 0$.
(There is a misprint in the footnote: the final > should be replaced by >.) 
Moreover, the formulas for the values of $h$ and $p$ obtained there show that if $r_i$
and $w_i$, $1 \le i \le n$, are all rational, then so are the values of $h$ and therefore of
$p(\partial e_i)$. The same conclusion holds if “nonnegative” is replaced by “nonpositive” in
[1].

As in [2] associate a linear graph with a tiling of a rectangle $R$ by rectangles
$R_1, R_2, \ldots, R_{n-1}$. (For convenience, denote $R$ also by $R_n$.) To do this, introduce
an $xy$ coordinate system such that $R_n$ is in the first quadrant, its edges are parallel
to the axes, and the origin is at a corner of $R$. Each rectangle $R_i$, $1 \le i \le n$, has
edges parallel to the $x$-axis (the “horizontal edges”) of length $h_i$ and edges parallel
to the $y$-axis (the “vertical edges”) of length $v_i$.

Let $S$ be the union of all the horizontal edges of the $n$ rectangles. The
midpoints of the connected components of $S$ will be the vertices of a linear graph
$G$. For each rectangle $R_i$, $1 \le i \le n - 1$, introduce an edge oriented from the
component containing its lower edge to the component containing its upper edge.
For $R_n = R$, introduce an edge oriented from its upper edge down to its lower
edge. At a vertex $V$ define $p(V)$ to be the $y$-coordinate of $V$. Thus for $1 \le i \le n - 1$,
$p(\partial e_i) = v_i$ and $p(\partial e_n) = -v_n$. Also define $h(e_i)$ to be $h_i$, $1 \le i \le n$.

The definitions of r; and w, will depend on the particular theorem to be proved. 


2. PROOFS OF THE THEOREMS. The proof of Theorem 1, as given in [2], goes
as follows. First place $R$ in such a way that its vertical length $v_n$ is rational. Define
$r_i$ to be $-v_i/h_i$, $1 \le i \le n - 1$. Define $r_n$ to be 0. Define $w_i$ to be 0, $1 \le i \le n - 1$,
and $w_n$ to be $-v_n$. Checking that I and II are satisfied is straightforward. Thus, all
$h_i$ and $v_i$, $1 \le i \le n$, are rational.
To prove Theorem 2 let $w_i = h_i + v_i$, $1 \le i \le n - 1$, and $w_n = -v_n$. Let $r_i = 1$,
$1 \le i \le n - 1$, and $r_n = 0$.

To prove Theorem 3 first place the rational edge of $R$ along the $y$-axis. For
$1 \le i \le n - 1$ let $r_i = -1$ and $w_i = v_i - h_i$. Let $r_n = 0$ and $w_n = -v_n$.

To prove Theorem 4 let $r_i = 1$ and $w_i = h_i + v_i$ for $1 \le i \le n - 1$. Let $r_n = 1$
and $w_n = h_n - v_n$.
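
To see how equations I and II pin down the widths, the following sketch solves them exactly for one small tiling, using the assignment from the proof of Theorem 2. Everything specific here is an assumption made for illustration: the tiling (a unit square cut into a bottom strip $1 \times 1/4$ and two upper pieces $1/3 \times 3/4$ and $2/3 \times 3/4$), the function name `solve_network`, and the use of plain Gauss-Jordan elimination rather than the formulas of [1].

```python
from fractions import Fraction as F


def solve_network(m, edges, r, w):
    """Solve equations I and II exactly; the potential of vertex 0 is fixed at 0.

    edges[i] = (tail, head) gives the orientation of edge e_i.
    Returns (h, p): currents on the edges and potentials at the m vertices.
    """
    n = len(edges)
    size = n + m - 1                      # unknowns: h_1..h_n, p(V_1)..p(V_{m-1})
    rows = []
    # Equation I at every vertex except vertex 0 (that equation is redundant).
    for v in range(1, m):
        row = [F(0)] * (size + 1)
        for i, (tail, head) in enumerate(edges):
            if head == v:
                row[i] += 1
            if tail == v:
                row[i] -= 1
        rows.append(row)
    # Equation II on every edge: p(head) - p(tail) + r_i h_i = w_i.
    for i, (tail, head) in enumerate(edges):
        row = [F(0)] * (size + 1)
        row[i] = F(r[i])
        if head != 0:
            row[n + head - 1] += 1
        if tail != 0:
            row[n + tail - 1] -= 1
        row[size] = F(w[i])
        rows.append(row)
    # Gauss-Jordan elimination over the rationals.
    for col in range(size):
        piv = next(k for k in range(col, size) if rows[k][col] != 0)
        rows[col], rows[piv] = rows[piv], rows[col]
        lead = rows[col][col]
        rows[col] = [x / lead for x in rows[col]]
        for k in range(size):
            if k != col and rows[k][col] != 0:
                factor = rows[k][col]
                rows[k] = [x - factor * y for x, y in zip(rows[k], rows[col])]
    sol = [rows[k][size] for k in range(size)]
    return sol[:n], [F(0)] + sol[n:]


# Vertices 0, 1, 2 are the horizontal segments at heights 0, 1/4, 1.
edges = [(0, 1), (1, 2), (1, 2), (2, 0)]          # last edge belongs to R itself
r = [1, 1, 1, 0]
w = [F(1) + F(1, 4), F(1, 3) + F(3, 4), F(2, 3) + F(3, 4), F(-1)]
h, p = solve_network(3, edges, r, w)
print(h, p)
```

The inputs are only the half-perimeters $h_i + v_i$ of the tiles and the rational height of $R$; the solver returns the individual widths and the heights of the horizontal segments, all rational, as Theorem 2 predicts.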

These theorems could be generalized by assuming that for each $R_i$, $1 \le i \le n - 1$,
there is a positive (negative) rational number $r_i$ such that $v_i + r_i h_i$ is
rational and that $v_n$ is rational. The proof is similar.


3. ANOTHER APPROACH. In the proof in [1] the values of $p$, $h$, and $w$ are never
multiplied by each other. Thus we may take their values in a vector space over the
field generated by $r_1, \ldots, r_n$, in particular in the abelian group $\mathbb{R}/\mathbb{Q}$, under
addition. In the proofs $w_i$ is now an element of $\mathbb{R}/\mathbb{Q}$, the zero element. Equations
I and II, which refer to elements in $\mathbb{R}/\mathbb{Q}$, hold and again the solution is unique,
namely $p$ and $h$ must both be the constant function with value $0 \in \mathbb{R}/\mathbb{Q}$.

For a different type of problem concerning tiling a rectangle by rectangles see [4].
The Birthday Problem

In an article in the January, 1992, issue of the MONTHLY, Joag-Dev and
Proschan present an elementary example of the use of majorization in probability.
This example considers the Birthday Problem where different dates have different
probabilities.

Another frequently taught problem in probability is the Coupon Collector’s
Problem, and this problem provides a similar elementary example of the use of
majorization. Suppose that $n$ objects are picked repeatedly and independently, with
the probability that object $i$ is picked on a given try being $p_i$ (where $p_1 + \cdots + p_n = 1$).
Let $p = (p_1, \ldots, p_n)$ and let $T_p$ be the Coupon Collector’s Time, i.e., the earliest time
at which all $n$ objects have been picked at least once. A reasonable exercise for
someone who has read the article of Joag-Dev and Proschan is to show that

$$P(T_{(1/n, \ldots, 1/n)} \le t) \ge P(T_p \le t)$$

for any $p$ and to show that $P(T_p \le t)$ is a Schur-concave function of $p$.

Martin V. Hildebrand
Department of Mathematics
The University of Michigan
Ann Arbor, MI 48109-1003
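
The inequality in the exercise can be explored on small cases before attempting a proof. The sketch below is an illustration only: it assumes the standard inclusion-exclusion expression for $P(T_p \le t)$, summing over the set of objects that are never picked, and the function name is a hypothetical choice.

```python
from fractions import Fraction as F
from itertools import combinations


def prob_collected_by(p, t):
    """P(T_p <= t): probability that all objects appear within t independent picks."""
    n = len(p)
    total = F(0)
    for k in range(n + 1):
        for missing in combinations(range(n), k):
            # Probability that none of the objects in `missing` is picked in t tries.
            avoid = 1 - sum((p[i] for i in missing), F(0))
            total += (-1) ** k * avoid ** t
    return total


uniform = [F(1, 3)] * 3
skewed = [F(1, 2), F(1, 4), F(1, 4)]
for t in range(3, 8):
    print(t, prob_collected_by(uniform, t), prob_collected_by(skewed, t))
```

For $n = 3$ and $t = 3$ the two values are $2/9$ and $3/16$, so the uniform distribution is ahead, as the exercise asserts in general.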
Ramanujan—For Lowbrows


Bruce C. Berndt and S. Bhargava 


“No, Inspector,” he said. “It is not at all like that, I am assuring you. You see, for a person of my 
sort—and I admit that we are a rare breed—numbers are so much in our minds there is hardly any 
question of writing them down, let alone adding one to another.” ... 

“Let me give you one instance,” he said. “Before I was beginning work just now, I was taking a
short stroll, and I happened to see a handcartwalla. Now, being the sort of chap I am, I of course
notice the number burned on the side of the cart: seventeen-twenty-nine. Now, does that mean
anything to you yourself?”

“It is the number on the cart,” Ghote answered guardedly. “By law it must be there.”

Raghu Barde smiled his warm smile again. 

“Ah, yes, the police view. But what do you think those figures meant to me? You would never 
guess. But the moment I was seeing them I said: Aha, the smallest number expressible as a sum of 
two cubes in two different ways. And, you know, if ever I am getting to marry, I suppose I will want 
a wife whose birth date comes to some number pleasing to me like that.” 

“I see,” Ghote said.

And, although the mumbo jumbo about cubes and expressible meant nothing to him, and he 
could not help thinking that to choose a wife by number would be a much riskier proceeding than to 
let the astrologers choose one for you, he did dimly see what a different sort of life Raghu Barde 
lived from that of the common number-unencumbered man. 


H. R. F. Keating 
Dead on Time 


1. INTRODUCTION. To celebrate the centenary of Ramanujan’s birth, in June, 
1987, an international conference was held at The University of Illinois at 
Urbana-Champaign [1]. Numerous roads through varied scenery brought re- 
searchers from Ramanujan’s papers, problems, letters, notebooks, and unpub- 
lished manuscripts to a panoply of areas of contemporary research, including 
partitions, mock theta-functions, statistical mechanics, Lie algebras, probabilistic 
number theory, modular forms, elliptic functions, complex multiplication, hypergeometric
series, q-series, asymptotic expansions, and beta integrals. Very few
mathematicians have ever had such a broad impact on mathematical research. 
Although many results presented at the conference could be understood and 
appreciated by mathematicians outside these areas of research, this was a confer- 
ence for highbrows. 

Many of Ramanujan’s beautiful discoveries, however, are easily understood, are 
elementary, and appeal to a wide variety of tastes. Thus, this paper is written for 
lowbrows. Only elementary algebra is needed to prove the lion’s share of theorems 
reported here. Most are found in the unorganized portion of Ramanujan’s second 
notebook, his third notebook, and problems that he posed for readers of the 
Journal of the Indian Mathematical Society. The results we describe fall under the 
headings of elementary algebra, equal sums of powers, and elementary number 
theory. 




We begin our expedition in a taxi-cab as we recount G. H. Hardy’s ride in
taxi-cab no. 1729 to visit Ramanujan, who was lying ill in Putney. Some historical
remarks are offered on the two representations $1^3 + 12^3 = 9^3 + 10^3$ of 1729. This
leads us to Euler’s solution, rediscovered by Ramanujan in a simpler form, of the
diophantine equation $A^3 + B^3 = C^3 + D^3$.

We turn from equal sums of third powers to equal sums of fourth powers and 
ask “Did Ramanujan ever read Mathematical Magazine?’ No, we are not speak- 
ing of the journal, Mathematics Magazine, published by the MAA, with the first 
issue appearing under a slightly different title in 1926, six years after Ramanujan’s 
death. Some historical remarks will be made about Mathematical Magazine. 

We next temporarily stop our journey to view what the authors consider to be 
one of the most captivating, enthralling finite identities in all of mathematics. Is 
this marvelous identity simply an accident on the road to sums of powers? Or are 
we at the base of the Himalayas—facing away from the mountains? 

We next encounter three types of systems of equations. The first system leads us 
to sequences that decrease for a while, then increase for a while, etc. We must 
have roamed to a college campus, for these sequences involve radicals, infinitely 
many of them. Like most radicals, these have interesting properties. The second 
system leads us to a visit with S. Ramanujam. No, that is not a misprint! Is he 
really Ramanujan, or is he someone else? Our third system was solved beautifully 
by Ramanujan in his third published paper, but he did not realize that J. J. 
Sylvester had solved this system in 1851, nor was Ramanujan aware of the 
implications of his work. We provide a sketch of Ramanujan’s clever proof. 

Proceeding from a sketch to a complete landscape, we provide proofs of some 
interesting properties of roots of cubic polynomials that Ramanujan discovered. As 
applications, we offer two curious trigonometric identities. 

For our last proof, we establish sharp bounds for a sum giving the largest power 
of a prime dividing n!. 

We conclude our paper with some approximations to $\pi$.

Several references will be made to Ramanujan’s notebooks [26], published in 
two volumes. The second volume contains the second and third notebooks, and all 
page numbers in this paper refer to the pagination in this volume. 


2. SUMS OF POWERS. Many readers are familiar with the famous taxi-cab story 
immortalized by Hardy [27, p. xxxv]. “I remember once going to see him when he 
was lying ill at Putney. I had ridden in taxi-cab no. 1729, and remarked that the 
number seemed to me rather a dull one, and that I hoped it was not an 
unfavourable omen. ‘No,’ he replied, ‘it is a very interesting number; it is the 
smallest number expressible as a sum of two cubes in two different ways.’ ” (It is 
clear that the author of the opening passage about a handcart with 1729 imprinted 
on its side was acquainted with this delightful incident in the life of Ramanujan 
and Hardy. A handcartwalla is a person who pulls a two-wheeled handcart, 
normally carrying one or two people, and is no longer a common sight in present 
day India. The suffix “walla” comes from Hindi.) In fact, Ramanujan had previously
recorded these two representations for 1729, $1^3 + 12^3$ and $9^3 + 10^3$, on page
225 of his second notebook [26]. However, this example appears to have been first
noticed by B. Frénicle de Bessy in 1657. Frénicle and J. Wallis each found 
additional examples for two equal sums of two cubes. A bitter argument ensued 
with each accusing the other of using trivial methods. Since P. Fermat also
frequently was feuding with these two men, letters detailing their acrimony can be
found in Fermat’s Oeuvres [11, pp. 419-420; 427-457] and E. T. Bell’s book [2,
Chapter 12], as well as in L. E. Dickson’s History [8, p. 552]. In 1898, C. Moreau
[18] found the ten solutions of $A^3 + B^3 = C^3 + D^3$ with the sums less than
100,000. After 1729, the next largest sum is $4104 = 2^3 + 16^3 = 9^3 + 15^3$.

[Figure: The frontispiece of a volume of Wordsworth’s poetry (The Poetical Works of
William Wordsworth, Oxford Edition, edited by Thomas Hutchinson). The volume was
awarded to the young Ramanujan for his “outstanding work in Maths.” Such prizes for
mathematical contests were common in Ramanujan’s hometown, Kumbakonam, and
throughout India of the period.]
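
Moreau's count is easy to reproduce today by brute force. The following sketch is an illustration, not Moreau's method: it assumes positive cubes only and simply tabulates every sum $A^3 + B^3$ with $A \le B$ below a bound (the function name is hypothetical).

```python
from collections import defaultdict


def two_way_cube_sums(limit):
    """Integers below `limit` that are sums of two positive cubes in at least two ways."""
    reps = defaultdict(list)
    a = 1
    while 2 * a**3 < limit:
        b = a
        while a**3 + b**3 < limit:
            reps[a**3 + b**3].append((a, b))
            b += 1
        a += 1
    return {s: pairs for s, pairs in sorted(reps.items()) if len(pairs) >= 2}


found = two_way_cube_sums(100_000)
for total, pairs in found.items():
    print(total, pairs)
```

The search lists ten sums below 100,000, beginning with 1729 and 4104.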

From another viewpoint, Ramanujan provided Hardy with solutions to the
classical diophantine equation

$$A^3 + B^3 + C^3 = D^3. \tag{2.1}$$

L. Euler [10] completely solved (2.1) for positive or negative rational solutions. At
three places in his notebooks, Ramanujan addresses the problem of finding
solutions of (2.1). In Entry 20(iii) of Chapter 18 and on page 266 in the unorganized
portion of his second notebook, Ramanujan provides parametric solutions to
(2.1), but they are not as general as Euler’s. But near the end of his third notebook
[26, p. 387], Ramanujan offers a family of solutions equivalent to Euler’s general
solution. Both Hardy [13, p. 11] and G. N. Watson [30] discussed one of Ramanujan’s
less general solutions to (2.1). They had no knowledge of Ramanujan’s
general solution, because they did not have access to the third notebook. We quote
Ramanujan’s theorem.

Theorem. If

$$\alpha^2 + \alpha\beta + \beta^2 = 3\lambda\gamma^2,$$

then

$$(\alpha + \lambda^2\gamma)^3 + (\lambda\beta + \gamma)^3 = (\lambda\alpha + \gamma)^3 + (\beta + \lambda^2\gamma)^3. \tag{2.2}$$

As an example, we recover the two pairs of aforementioned taxi-cab cubes by
putting $(\alpha, \beta, \gamma, \lambda) = (3, 0, 1, 3)$ in (2.2).
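
The substitution can be carried out mechanically. The short sketch below (the function name is an illustrative choice) returns the four bases in (2.2) for given integer parameters and refuses inputs that violate the hypothesis $\alpha^2 + \alpha\beta + \beta^2 = 3\lambda\gamma^2$.

```python
def ramanujan_cubes(alpha, beta, gamma, lam):
    """Bases (p, q, r, s) with p^3 + q^3 = r^3 + s^3 as given by (2.2)."""
    if alpha**2 + alpha * beta + beta**2 != 3 * lam * gamma**2:
        raise ValueError("hypothesis of the theorem is not satisfied")
    return (alpha + lam**2 * gamma, lam * beta + gamma,
            lam * alpha + gamma, beta + lam**2 * gamma)


p, q, r, s = ramanujan_cubes(3, 0, 1, 3)
print(p, q, r, s, p**3 + q**3, r**3 + s**3)
```

With $(\alpha, \beta, \gamma, \lambda) = (3, 0, 1, 3)$ the bases are 12, 1, 10, 9, and both sides equal 1729.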

Although several formulations equivalent to Euler’s general solution have been 
discovered, Ramanujan’s formulation (2.2) appears to be the simplest of all. The 
problem of completely characterizing all positive integral solutions of (2.1) is 
unsolved. 

On the other hand, Euler conjectured that there were no positive integral
solutions to

$$A^4 + B^4 + C^4 = D^4.$$

It was not until 1988 that Euler’s conjecture was shown to be false by N. D. Elkies
[9], who found an infinite class of solutions.

Ramanujan derived several theorems providing infinite families of solutions for
equal sums of powers. For example, toward the end of his third notebook [26,
p. 384], he writes two parametric solutions for representing a fourth power as a
sum of five fourth powers.


Theorem. If $s$, $t$, $m$, and $n$ are arbitrary, then

$$(8s^2 + 40st - 24t^2)^4 + (6s^2 - 44st - 18t^2)^4 + (14s^2 - 4st - 42t^2)^4 + (9s^2 + 27t^2)^4 + (4s^2 + 12t^2)^4 = (15s^2 + 45t^2)^4 \tag{2.3}$$

and

$$(4m^2 - 12n^2)^4 + (3m^2 + 9n^2)^4 + (2m^2 - 12mn - 6n^2)^4 + (4m^2 + 12n^2)^4 + (2m^2 + 12mn - 6n^2)^4 = (5m^2 + 15n^2)^4. \tag{2.4}$$

Ramanujan recorded several examples. For instance, if we set $s = 1$ and $t = 0$
in (2.3), we find that

$$4^4 + 6^4 + 8^4 + 9^4 + 14^4 = 15^4.$$
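
Both parametrizations are polynomial identities, so they can be spot-checked with exact integer arithmetic. The sketch below is a verification aid only; the function names are illustrative, and the formulas are transcribed from (2.3) and (2.4).

```python
def haldeman_23(s, t):
    """Five bases and the sixth base of (2.3)."""
    terms = (8*s*s + 40*s*t - 24*t*t, 6*s*s - 44*s*t - 18*t*t,
             14*s*s - 4*s*t - 42*t*t, 9*s*s + 27*t*t, 4*s*s + 12*t*t)
    return terms, 15*s*s + 45*t*t


def haldeman_24(m, n):
    """Five bases and the sixth base of (2.4)."""
    terms = (4*m*m - 12*n*n, 3*m*m + 9*n*n, 2*m*m - 12*m*n - 6*n*n,
             4*m*m + 12*n*n, 2*m*m + 12*m*n - 6*n*n)
    return terms, 5*m*m + 15*n*n


for formula in (haldeman_23, haldeman_24):
    for u in range(-5, 6):
        for v in range(-5, 6):
            terms, total = formula(u, v)
            assert sum(x**4 for x in terms) == total**4

print(haldeman_23(1, 0))
```

Setting $s = 1$, $t = 0$ reproduces the bases 8, 6, 14, 9, 4 and the sum $15^4$ of the example above.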

Formula (2.3) is due to C. B. Haldeman [12, pp. 289-290] in 1904. Uncannily, 
Ramanujan used the same notation and recorded the terms in the same order as 
Haldeman! Likewise, (2.4) was established by Haldeman [12, p. 289] and slightly 
later by A. Martin [15, pp. 325-326, 331]. Ramanujan does not use Haldeman’s 
notation in (2.4) but does employ Martin’s notation! 

Ramanujan recorded his results in notebooks from about 1903 until he departed
for England in 1914. The 16 chapters in the first notebook and the 21 chapters in 
the second evince a progressive maturation from more elementary mathematics to 
much deeper results. The third notebook, however, contains both very elementary 
results as well as advanced results. While the latter theorems may have been 
recorded in Cambridge, the former results were probably recorded early in the 
period 1903-1914. Since in India Ramanujan did not have access to even the 
primary mathematical journals of his day, it is extremely unlikely that he could 
have seen the obscure journal, Mathematical Magazine, in which Martin and 
Haldeman published their results. Thus, the notation in (2.3) and (2.4) being 
identical with that of Haldeman and Martin, respectively, must be coincidental. 

Mathematical Magazine was founded and edited by Martin and was devoted to 
“elementary mathematics.” Issues of the first volume were published quarterly in 
1882-1884 at a cost of 50 cents per issue or one dollar per year. The second and
last volume of 12 issues was published over the years 1890-1904, with the last four
issues appearing in January, 1895; January, 1896; December, 1898; and January,
1904. The last issue contains four papers, three by Martin and one by Haldeman.
In the penultimate issue, under the heading “Editorial Items,” we learn that
“Since No. 10 of the Magazine was published, three able contributors have
‘crossed over’ and ‘passed beyond the confines of earth.’” It is likely that an even
greater number “crossed over” between the 11th and 12th issues. Possibly due to 
complaints registered by readers disgruntled over the irregularity at which issues 
appeared, the price per issue had dropped to 30 cents. 

Toward the end of the third notebook [26, p. 386], Ramanujan records one of 
the most fascinating identities we have ever seen. 


Theorem. Let $a$, $b$, $c$, and $d$ denote any numbers such that $ad = bc$. Then

$$\begin{aligned}
&64\{(a+b+c)^6 + (b+c+d)^6 - (c+d+a)^6 - (d+a+b)^6 + (a-d)^6 - (b-c)^6\} \\
&\quad\times \{(a+b+c)^{10} + (b+c+d)^{10} - (c+d+a)^{10} - (d+a+b)^{10} + (a-d)^{10} - (b-c)^{10}\} \\
&= 45\{(a+b+c)^8 + (b+c+d)^8 - (c+d+a)^8 - (d+a+b)^8 + (a-d)^8 - (b-c)^8\}^2.
\end{aligned} \tag{2.5}$$


The hypothesis ad = bc was omitted by Ramanujan, although it does appear as 
a hypothesis for some related results on the previous page. 

We first transcribe (2.5) into a somewhat more transparent form. For each
positive integer $m$, set

$$F_{2m}(a,b,c,d) = (a+b+c)^{2m} + (b+c+d)^{2m} - (c+d+a)^{2m} - (d+a+b)^{2m} + (a-d)^{2m} - (b-c)^{2m}.$$

Put $b = ax$, $c = ay$, and $d = axy$, which does not contravene the hypothesis
$ad = bc$. Then it is easy to see that

$$F_{2m}(a,b,c,d) = a^{2m} f_{2m}(x,y),$$

where

$$f_{2m}(x,y) = (1+x+y)^{2m} + (x+y+xy)^{2m} - (y+xy+1)^{2m} - (xy+1+x)^{2m} + (1-xy)^{2m} - (x-y)^{2m}. \tag{2.6}$$

Hence, (2.5) can be put in the form

$$64 f_6(x,y) f_{10}(x,y) = 45 f_8^2(x,y). \tag{2.7}$$


We first employed the computer algebra system Mathematica to verify (2.7).
Next, using Mathematica, we attempted to find other identities like (2.7) involving
$f_{2m}(x, y)$ for $m < 10$, but we were unsuccessful. We fortunately found a much
more informative proof of (2.7) that is not merely a verification via computer
algebra [6]. We will not repeat that proof here but instead offer a few additional
remarks.
By inspection, we easily see that $x = 0, 1, -1, -2, -1/2$ are zeros of $f_{2m}(x, y)$.
By symmetry, $y = 0, 1, -1, -2, -1/2$ are also zeros. Since $f_{2m}$ has degree (at
most) $2m$ in each of the variables $x$ and $y$, it follows that $f_2(x, y) \equiv 0 \equiv f_4(x, y)$.
In our original notation, we have therefore proved that, if $ad = bc$, then

$$(a+b+c)^n + (b+c+d)^n + (a-d)^n = (c+d+a)^n + (d+a+b)^n + (b-c)^n, \tag{2.8}$$

where $n = 2$ or 4. These are the aforementioned results that appear on page 385
of [26]. We have therefore returned to the problem of generating equal sums of
biquadrates. Although many results have appeared in the literature yielding two
equal sums of three biquadrates [8, pp. 653-657], none appear as simple as
Ramanujan’s identity (2.8).
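
Both (2.5) and (2.8) lend themselves to a quick exact check on integer quadruples with $ad = bc$. The sketch below is a spot check in ordinary integer arithmetic, not a proof and not the authors' Mathematica session; the function names and the sample quadruples are illustrative choices.

```python
def F(k, a, b, c, d):
    """F_k(a, b, c, d) as defined before (2.6), for an even exponent k."""
    return ((a + b + c)**k + (b + c + d)**k - (c + d + a)**k
            - (d + a + b)**k + (a - d)**k - (b - c)**k)


def check_identities(a, b, c, d):
    """True when (2.8) holds for n = 2, 4 and (2.5) holds; requires ad = bc."""
    assert a * d == b * c, "hypothesis ad = bc is not satisfied"
    vanishing = F(2, a, b, c, d) == 0 and F(4, a, b, c, d) == 0
    six_ten_eight = 64 * F(6, a, b, c, d) * F(10, a, b, c, d) == 45 * F(8, a, b, c, d)**2
    return vanishing and six_ten_eight


for quad in [(1, 2, 3, 6), (1, 2, 2, 4), (2, 3, 4, 6), (1, -2, 5, -10)]:
    print(quad, check_identities(*quad))
```

For $(a, b, c, d) = (1, 2, 2, 4)$, for instance, $F_6 = 43200$, $F_8 = 5644800$, and $F_{10} = 518616000$, and the two sides of (2.5) agree.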


Are (2.5) and (2.7) merely accidents, or are they a manifestation of some far 
deeper theorem? 


3. ELEMENTARY ALGEBRA. In courses and texts on beginning calculus, stu- 
dents encounter many monotonic sequences in their study of sequences and series. 
An inquisitive student may ask for naturally occurring examples of sequences that 
increase for a while, then decrease for a while, etc. As we shall see, some infinite 
sequences of nested radicals of Ramanujan provide excellent examples. 


Mrs. Ramanujan (S. Janaki Ammal) and W. Narayanan, one of her two adopted sons. 


In 1914, Ramanujan [22], [27, pp. 327-329] posed the following problem to 
readers of the Journal of the Indian Mathematical Society: Solve completely 


$$x^2 = y + a, \quad y^2 = z + a, \quad \text{and} \quad z^2 = x + a. \tag{3.1}$$


Concomitantly, he asked for the evaluation of three infinite sequences of nested 
radicals. Toward the end of his second notebook [26, pp. 305-307], Ramanujan 
recorded further and more general results. It is not difficult to see that x is a root 
of an octic polynomial. This polynomial can be factored over the quadratic field 
$\mathbb{Q}(\sqrt{4a - 7})$ into one quadratic and two cubic factors. These factors are correctly
given by Ramanujan in his solution [22], but the factors given in the solution
printed in his Collected Papers [27, pp. 327-329] contain four sign errors.

From the equalities (3.1), we find that

$$x = \sqrt{a + y} = \sqrt{a + \sqrt{a + z}} = \sqrt{a + \sqrt{a + \sqrt{a + x}}} = \sqrt{a + \sqrt{a + \sqrt{a + \sqrt{a + \cdots}}}}. \tag{3.2}$$


Each square root should be considered two-valued, and so we are led to eight
infinite sequences of nested radicals corresponding to the eight roots of our octic
polynomial. First, we should determine those values of $a$ for which the infinite
radical in (3.2) converges. This is not an easy problem, but each of the eight
sequences in (3.2) converges at least for $a \ge 2$ [5, Chapter 22]. As a specific
example, let

$$a_1 = \sqrt{a}, \quad a_2 = \sqrt{a - \sqrt{a}}, \quad a_3 = \sqrt{a - \sqrt{a + \sqrt{a}}}, \quad a_4 = \sqrt{a - \sqrt{a + \sqrt{a + \sqrt{a}}}}, \ \ldots,$$

where the sequence of signs $-, +, +, \ldots$ appearing in the nested radicals has
period 3. A careful analysis shows that

$$a_{6n+1} > a_{6n+2} > a_{6n+3} > a_{6n+4}$$

and

$$a_{6n+4} < a_{6n+5} < a_{6n+6} < a_{6n+7},$$

for each nonnegative integer $n$. Furthermore,

$$0 < a_4 < a_{10} < \cdots < a_{6n+4} < a_{6n+7} < a_{6n+1} < \cdots < a_7 < a_1 = \sqrt{a}.$$

Thus, $\{a_{6n+1}\}$ and $\{a_{6n+4}\}$ converge. Next, it must be shown that $\{a_{3n+1}\}$ converges
and, lastly, that $\{a_n\}$ converges. The details in this analysis are not easy [5, Chapter
22].

If we solve the two cubic equations mentioned above, it is not easy, in general,
to identify the roots with the appropriate infinite sequences of radicals. For
example,

$$\lim_{n \to \infty} a_n = \frac{A - 1}{6} + \frac{2}{3}\sqrt{4a + A}\,\sin\left(\frac{1}{3}\arctan\frac{2A + 1}{3\sqrt{3}}\right), \tag{3.3}$$

where $A = \sqrt{4a - 7}$. We made these identifications by expanding both the
algebraically determined roots and the infinite radicals around “$a = \infty$.” For
example, both sides of (3.3) have the asymptotic expansions

$$\sqrt{a} - \frac{1}{2} - \frac{3}{8\sqrt{a}} - \frac{1}{4a} + \cdots$$

as $a$ tends to $\infty$. For particular numerical examples, the proper identifications are
easier to make. For instance, if $a = 2$ in (3.3),

$$2\sin\left(\frac{\pi}{18}\right) = \sqrt{2 - \sqrt{2 + \sqrt{2 + \sqrt{2 - \cdots}}}}.$$

Later, Ramanujan [25], [27, p. 332] submitted the similar problem of determin- 
ing the simultaneous solutions of the system, 


$$x^2 = a + y, \quad y^2 = a + z, \quad z^2 = a + u, \quad \text{and} \quad u^2 = a + x,$$


to the Journal of the Indian Mathematical Society. Fourteen years elapsed before a 
solution by G. N. Watson [29] was published, while another solution can be found 
in [5, Chapter 22]. As above, interesting sequences of nested radicals arise. For
example,

$$\sqrt{5 + \sqrt{5 + \sqrt{5 - \sqrt{5 + \cdots}}}} = \tfrac{1}{2}\left(2 + \sqrt{5} + \sqrt{15 - 6\sqrt{5}}\right),$$

where the infinite sequence of signs +, +, −, +, ... has period 4.

The theory of infinite sequences of nested radicals has not been well developed, 
probably because general theorems are difficult to obtain and convergence is slow. 
For further examples, theorems, and references to the literature, see [3, pp. 
108—112] and [5, Chapter 22]. 
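
The convergence claims above are easy to probe numerically. The following Python sketch is our own illustration and not part of the cited analysis: the helper names `nested_radical` and `closed_form_33` are hypothetical, and the truncation depths are arbitrary choices. It evaluates the truncated radicals $a_n$ for the period-3 sign pattern, compares the limit for a = 2 with 2 sin(π/18) and with the right-hand side of (3.3), and then evaluates the period-4 example with a = 5.

```python
import math

def nested_radical(a, signs, n):
    """Return a_n = sqrt(a + s0*sqrt(a + s1*sqrt(... + s_{n-2}*sqrt(a)))).

    `signs` is one period of the sign pattern, read from the outermost
    radical inward, e.g. (-1, +1, +1) for the sequence a_n in the text.
    """
    value = math.sqrt(a)                    # innermost radical
    for level in range(n - 2, -1, -1):      # work outward, one radical at a time
        value = math.sqrt(a + signs[level % len(signs)] * value)
    return value

def closed_form_33(a):
    """Right-hand side of (3.3), with A = sqrt(4a - 7)."""
    A = math.sqrt(4 * a - 7)
    angle = math.atan((2 * A + 1) / (3 * math.sqrt(3))) / 3
    return (A - 1) / 6 + (2 / 3) * math.sqrt(4 * a + A) * math.sin(angle)

if __name__ == "__main__":
    for n in (1, 4, 7, 10, 40):
        print(n, nested_radical(2, (-1, 1, 1), n))
    print("2 sin(pi/18)       =", 2 * math.sin(math.pi / 18))
    print("closed form, a = 2 =", closed_form_33(2))
    print("period-4 radical   =", nested_radical(5, (1, 1, -1, 1), 40))
    print("closed value       =",
          (2 + math.sqrt(5) + math.sqrt(15 - 6 * math.sqrt(5))) / 2)
```

A floating-point check of this kind says nothing about the difficult convergence proof; it only suggests that the identifications of roots with radicals are the right ones.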

In the unorganized portions of his notebooks [26] and in the problem sections of 
the Journal of the Indian Mathematical Society, Ramanujan offers other problems 
on systems of equations. Thus, on page 338 of [26], he asks for the solutions of 


$$\frac{x^5 - a}{x^2 - y} = \frac{y^5 - b}{y^2 - x} = 5(xy - 1),$$


where a and b are arbitrary constants. There are 25 pairs (x, y) of solutions. The
special case a = 6, b = 9 appeared as Question 284 [20], [27, pp. 322-323] in the
Journal of the Indian Mathematical Society. Ramanujan’s solution was the only one
received, and a similar solution to the more general problem can be found in [5,
Chapter 22].

Question 284 was the fourth problem that Ramanujan published in the Journal 
of the Indian Mathematical Society. The first five problems that Ramanujan posed 
to Journal readers were published under the name S. Ramanujam. Ramanujan 
and Ramanujam are two versions of the same Sanskrit name RAMANUJAHA, 
which means younger brother of Rama. 

We mention one further system of equations studied by Ramanujan. On page 
338 of his second notebook, Ramanujan asks, in slightly different notation, for the 
solutions of the system of 2n equations,

$$x_1 y_1^{j-1} + x_2 y_2^{j-1} + \cdots + x_n y_n^{j-1} = a_j, \qquad 1 \le j \le 2n, \qquad (3.4)$$

where $x_1, \ldots, x_n, y_1, \ldots, y_n$ are 2n unknowns. In his short paper [21], [27, pp.
18-19], Ramanujan presents his clever solution, which we briefly indicate.
Ramanujan defines

$$\varphi(\theta) = \sum_{j=1}^{n} \frac{x_j}{1 - \theta y_j}. \qquad (3.5)$$

When $\varphi(\theta)$ is expanded in a power series in $\theta$, it is seen that the coefficient of $\theta^k$
is $a_{k+1}$, $0 \le k \le 2n - 1$. On the other hand, $\varphi(\theta)$ has the form

$$\varphi(\theta) = \frac{\sum_{j=0}^{n-1} A_{j+1}\theta^{j}}{1 + \sum_{j=1}^{n} B_j \theta^{j}}. \qquad (3.6)$$

Clearing the denominator in (3.6) and using the aforementioned power series for
$\varphi(\theta)$, we can determine first the coefficients $B_j$, $1 \le j \le n$, and secondly the
coefficients $A_j$, $1 \le j \le n$, in terms of $a_1, a_2, \ldots, a_{2n}$, by equating coefficients of
like powers of $\theta$. Having explicitly determined $A_j$ and $B_j$, $1 \le j \le n$, we substitute
these values into (3.6) and once again expand $\varphi(\theta)$ into partial fractions. Comparing
the result with (3.5), we determine $x_j$ and $y_j$, $1 \le j \le n$.
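
Ramanujan's procedure for (3.4) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not Ramanujan's own computation: the function names are ours, the $y_j$ are assumed distinct, the roots of the denominator of (3.6) are found with the Durand-Kerner iteration (a technique we swap in; any polynomial root finder would do), and the final partial-fraction step is replaced by solving the first n equations of (3.4), which are linear in the $x_j$ once the $y_j$ are known. The numerator coefficients $A_j$ are therefore never needed here.

```python
from fractions import Fraction

def solve_linear(matrix, rhs):
    """Solve a square linear system by Gauss-Jordan elimination.

    Works for Fraction entries (exact) as well as complex entries.
    """
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if aug[pivot][col] == 0:
            raise ValueError("singular system")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [u - factor * v for u, v in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def poly_roots(coeffs, iterations=500):
    """Durand-Kerner iteration for a monic polynomial (highest degree first)."""
    n = len(coeffs) - 1
    roots = [(0.4 + 0.9j) ** k for k in range(n)]   # distinct starting guesses
    for _ in range(iterations):
        updated = []
        for i, z in enumerate(roots):
            value = 0
            for c in coeffs:                         # Horner evaluation
                value = value * z + c
            denom = 1
            for j, w in enumerate(roots):
                if j != i:
                    denom *= z - w
            updated.append(z - value / denom)
        roots = updated
    return roots

def ramanujan_system(a):
    """Recover (x, y) from a_1..a_2n, where a[j-1] = sum_i x_i * y_i**(j-1)."""
    a = [Fraction(v) for v in a]
    n = len(a) // 2
    # The coefficients of theta^k, k = n..2n-1, in
    # (1 + sum_j B_j theta^j) * phi(theta) must vanish.
    matrix = [[a[k - j] for j in range(1, n + 1)] for k in range(n, 2 * n)]
    B = solve_linear(matrix, [-a[k] for k in range(n, 2 * n)])
    # 1 + sum_j B_j theta^j = prod_j (1 - theta*y_j), so the y_j are the
    # roots of z^n + B_1 z^(n-1) + ... + B_n.
    y = poly_roots([1.0] + [float(b) for b in B])
    # Swapped-in last step: the first n equations of (3.4) are linear in x.
    vandermonde = [[yj ** k for yj in y] for k in range(n)]
    x = solve_linear(vandermonde, [complex(float(a[k])) for k in range(n)])
    return x, y

if __name__ == "__main__":
    xs, ys = [2, -1, 3], [Fraction(1, 2), 2, -1]
    a = [sum(x * y ** k for x, y in zip(xs, ys)) for k in range(6)]
    print(ramanujan_system(a))
```

The linear system for the $B_j$ is solved exactly with `Fraction`, so rounding enters only through the root finder and the final solve.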

It is easy to see that the system (3.4) is equivalent to the single equation 


$$\sum_{i=1}^{n} x_i (y_i s + t)^{2n-1} = \sum_{j=0}^{2n-1} \binom{2n-1}{j} a_{j+1}\, s^{j} t^{2n-1-j}.$$

Thus, Ramanujan’s query is equivalent to the question: When can a binary
(2n − 1)-ic form be represented as a sum of n (2n − 1)th powers? In 1851,


Sylvester [28, pp. 203-216, 265-283] found the following necessary and sufficient
conditions for a solution: The system of n equations

$$a_j u_1 + a_{j+1} u_2 + \cdots + a_{j+n} u_{n+1} = 0, \qquad 1 \le j \le n,$$

must have a solution $u_1, u_2, \ldots, u_{n+1}$ such that the n-ic form

$$p(w, z) = \sum_{j=0}^{n} u_{j+1} w^{j} z^{n-j}$$

can be represented as a product of n distinct linear forms. This is true for a
general 2n-tuple $(a_1, a_2, \ldots, a_{2n})$ in the sense of algebraic geometry. Thus, the
numbers $y_1, y_2, \ldots, y_n$ are related to the factorization of p(w, z). Sylvester’s
theorem belongs to the subject of invariant theory, which was developed in the late
19th and early 20th centuries. For a contemporary treatment, but with classical
language, see a paper by J. P. S. Kung and G.-C. Rota [14].

We next consider the following theorem of Ramanujan [26, p. 325]. 


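Theorem. Let $\alpha$, $\beta$, and $\gamma$ denote the roots of the cubic equation

$$x^3 - ax^2 + bx - 1 = 0. \qquad (3.7)$$

Then, for a suitable determination of roots,

$$\alpha^{1/3} + \beta^{1/3} + \gamma^{1/3} = (a + 6 + 3t)^{1/3} \qquad (3.8)$$

and

$$(\alpha\beta)^{1/3} + (\beta\gamma)^{1/3} + (\gamma\alpha)^{1/3} = (b + 6 + 3t)^{1/3}, \qquad (3.9)$$

where

$$t^3 - 3(a + b + 3)t - (ab + 6(a + b) + 9) = 0. \qquad (3.10)$$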


Since this beautiful elementary theorem is evidently new and since a short proof 
can be given, we provide one here. 
Proof: Noting, from (3.7), that $\alpha\beta\gamma = 1$, let

$$z^3 - \theta z^2 + \varphi z - 1 = 0 \qquad (3.11)$$

denote the cubic equation with roots $\alpha^{1/3}$, $\beta^{1/3}$, and $\gamma^{1/3}$, chosen so that their
product equals 1. Cubing both sides of the equality

$$z^3 - 1 = \theta z^2 - \varphi z,$$

we find that

$$(z^3 - 1)^3 - \theta^3 z^6 + \varphi^3 z^3 + 3\theta\varphi z^3 (z^3 - 1) = 0. \qquad (3.12)$$

Since $\alpha^{1/3}$, $\beta^{1/3}$, and $\gamma^{1/3}$ are roots of (3.11), they are also roots of (3.12). As a
cubic polynomial in $z^3$, (3.12) thus has the roots $\alpha$, $\beta$, and $\gamma$.
Comparing (3.7) and (3.12), we deduce that

$$a = \theta^3 + 3 - 3\theta\varphi \qquad (3.13)$$

and

$$b = \varphi^3 + 3 - 3\theta\varphi. \qquad (3.14)$$

If we define t by

$$\theta^3 = a + 6 + 3t, \qquad (3.15)$$

then, by (3.11) and (3.15),

$$\alpha^{1/3} + \beta^{1/3} + \gamma^{1/3} = \theta = (a + 6 + 3t)^{1/3},$$

which proves (3.8). Also, by (3.13)-(3.15),

$$\varphi^3 = b - 3 + 3\theta\varphi = b + \theta^3 - a = b + 6 + 3t. \qquad (3.16)$$

Hence, by (3.11) and (3.16), (3.9) is established. From (3.13) and (3.15),

$$3 + t = \theta\varphi. \qquad (3.17)$$

Thus, by (3.15)-(3.17),

$$(3 + t)^3 = \theta^3\varphi^3 = (a + 6 + 3t)(b + 6 + 3t).$$

Expanding both sides, collecting terms, and simplifying, we deduce (3.10).
On page 356 of [26], the last page of the second notebook, Ramanujan offers 
the equalities 


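$$\left(\cos\frac{2\pi}{9}\right)^{1/3} + \left(\cos\frac{4\pi}{9}\right)^{1/3} - \left(\cos\frac{\pi}{9}\right)^{1/3} = \left\{\frac{3}{2}\left(9^{1/3} - 2\right)\right\}^{1/3} \qquad (3.18)$$

and

$$\left(\sec\frac{2\pi}{9}\right)^{1/3} + \left(\sec\frac{4\pi}{9}\right)^{1/3} - \left(\sec\frac{\pi}{9}\right)^{1/3} = \left\{6\left(9^{1/3} - 1\right)\right\}^{1/3}, \qquad (3.19)$$

which are applications of (3.8) and (3.9), respectively, with a = 0, b = −3, and
$t = -9^{1/3}$. Equality (3.18) was posed as a problem by Ramanujan in the Journal of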
the Indian Mathematical Society [23], [27, p. 329]. Proofs of (3.18) and (3.19) can 
also be found in Berndt’s book [5, Chapter 22]. 
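
Both equalities can be confirmed in floating point. The sketch below is our own check, not a proof; the helper names are hypothetical. It uses real cube roots throughout, which is the "suitable determination of roots" for this example, and it also confirms that $t = -9^{1/3}$ satisfies (3.10) when a = 0 and b = −3.

```python
import math

def real_cbrt(v):
    """Real cube root, also valid for negative arguments."""
    return math.copysign(abs(v) ** (1.0 / 3.0), v)

def check_318_319():
    """Return ((lhs, rhs) of (3.18), (lhs, rhs) of (3.19))."""
    c2, c4, c1 = (math.cos(k * math.pi / 9) for k in (2, 4, 1))
    lhs_318 = real_cbrt(c2) + real_cbrt(c4) - real_cbrt(c1)
    rhs_318 = real_cbrt(1.5 * (real_cbrt(9) - 2))
    lhs_319 = real_cbrt(1 / c2) + real_cbrt(1 / c4) - real_cbrt(1 / c1)
    rhs_319 = real_cbrt(6 * (real_cbrt(9) - 1))
    return (lhs_318, rhs_318), (lhs_319, rhs_319)

def cubic_310(a, b, t):
    """Left-hand side of (3.10)."""
    return t ** 3 - 3 * (a + b + 3) * t - (a * b + 6 * (a + b) + 9)

if __name__ == "__main__":
    print(check_318_319())
    print(cubic_310(0, -3, -real_cbrt(9)))
```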


4. NUMBER THEORY. Suppose p is a prime and n is a positive integer. Then, by
a well-known theorem in elementary number theory [19, p. 182], the exponent of
the highest power of p dividing n! equals

$$\sum_{k=1}^{\infty} \left\lfloor \frac{n}{p^k} \right\rfloor =: N.$$

Despite the widespread use of this theorem by number theorists for many years,
the inequalities

$$\frac{n}{p-1} - \frac{\log(n+1)}{\log p} \le N \le \frac{n-1}{p-1}, \qquad (4.1)$$

given by Ramanujan [26, p. 378] in his third notebook do not appear to have been
heretofore noticed. Both inequalities in (4.1) are sharp. If $n = p^m$ for some
positive integer m, an elementary calculation shows that N = (n − 1)/(p − 1).
On the other hand, if $n = p^{m+1} - 1$, by a direct calculation with the observation
that m + 1 = log(n + 1)/log p,

$$N = \frac{n}{p-1} - \frac{\log(n+1)}{\log p}.$$

In fact, Ramanujan stated (4.1) with p replaced by an arbitrary positive integer
a ≥ 2.

Bhargava, Adiga, and Somashekara [7] have given one proof of (4.1) when p is 
any positive integer exceeding 1. We offer another proof here. 


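Proof of (4.1): First, by writing n in base p, i.e., by setting

$$n = \sum_{j=0}^{m} b_j p^j, \qquad 0 \le b_j \le p - 1, \quad b_m \ne 0,$$

we find, after a straightforward calculation, that

$$N = \frac{n}{p-1} - \frac{1}{p-1}\sum_{j=0}^{m} b_j, \qquad (4.2)$$

and so the second inequality in (4.1) follows, since the digit sum is at least 1.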

The first inequality in (4.1) is more difficult to establish. We are very grateful to 
B. Reznick for supplying the following elegant proof. 

Set

$$b = \sum_{j=0}^{m} b_j.$$

Then, by (4.2), it suffices to prove that

$$b \le (p - 1)\frac{\log(n+1)}{\log p}. \qquad (4.3)$$

Write

$$b = k(p - 1) + r, \qquad 0 \le r \le p - 2. \qquad (4.4)$$

Then

$$n \ge (p-1)p^0 + (p-1)p + (p-1)p^2 + \cdots + (p-1)p^{k-1} + r p^k = (r+1)p^k - 1.$$

It follows that

$$(p-1)\frac{\log(n+1)}{\log p} \ge (p-1)\frac{\log\{(r+1)p^k\}}{\log p} = k(p-1) + (p-1)\frac{\log(r+1)}{\log p}. \qquad (4.5)$$

By (4.3)-(4.5), we shall be finished with the proof if we can show that

$$r \le (p-1)\frac{\log(r+1)}{\log p}. \qquad (4.6)$$

First, if r = 0, (4.6) clearly holds with equality.
If r ≥ 1, (4.6) can be written in the form

$$\frac{r}{\log(r+1)} \le \frac{p-1}{\log p},$$

or

$$f(r) \le f(p - 1), \qquad (4.7)$$

where

$$f(x) = \frac{x}{\log(x+1)}.$$

By elementary calculus, f(x) is strictly increasing for positive x.
Since 1 ≤ r ≤ p − 2, (4.7) is therefore valid with a strict inequality, and so the
proof is complete.
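
The inequalities (4.1), the digit-sum formula (4.2), and the two sharpness cases lend themselves to a brute-force check. The sketch below is our own illustration with hypothetical function names; the ranges of n and p tested are arbitrary, and the small tolerance only absorbs floating-point error in the logarithms.

```python
import math

def legendre_exponent(n, p):
    """N = sum over k >= 1 of floor(n / p**k)."""
    total, power = 0, p
    while power <= n:
        total += n // power
        power *= p
    return total

def digit_sum(n, p):
    """Sum of the base-p digits b_j of n."""
    total = 0
    while n:
        total += n % p
        n //= p
    return total

def ramanujan_bounds(n, p):
    """Lower and upper bounds for N from (4.1)."""
    lower = n / (p - 1) - math.log(n + 1) / math.log(p)
    upper = (n - 1) / (p - 1)
    return lower, upper

if __name__ == "__main__":
    n, p = 100, 5
    print(ramanujan_bounds(n, p), legendre_exponent(n, p))
```

For n = 100 and p = 5 the exponent is N = 20 + 4 = 24, which lies between the two bounds 25 − log 101 / log 5 and 99/4.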

As remarked in the Introduction, we conclude this short sampling of Ramanujan’s
elementary discoveries with a note on π. Continued fractions provide
excellent rational approximations to π. Thus, the simple continued fraction

$$\pi = 3 + \cfrac{1}{7 + \cfrac{1}{15 + \cfrac{1}{1 + \cfrac{1}{292 + \cdots}}}}$$

yields the successive approximations 22/7, 333/106, 355/113, ..., the last of
which is

$$\frac{355}{113} = 3.14159\,29\ldots,$$

which agrees with the decimal expansion of π = 3.14159 26535... through 6
decimal places. The appearance of a “large” fourth partial quotient, 292, is
primarily responsible for this success.
Taking a brief diversion in his famous paper on approximations to π [24], [27, p.
35], Ramanujan offers the approximation

$$\pi \approx \left(97 + \frac{1}{2} - \frac{1}{11}\right)^{1/4} = 3.14159\,26526\ldots, \qquad (4.8)$$

which “was obtained empirically.” How did Ramanujan deduce this unusual
approximation, which is also found in his second and third notebooks [26, pp. 217,
375]? N. D. Mermin [16], [17, pp. 304-305] has offered the best explanation for
Ramanujan’s approximation (4.8). In the decimal expansion of $\pi^4$ =
97.409091034002..., observe that the pair of digits 09 appears twice in succession
followed by the pair 10, which is ‘close’ to 09. Thus,

$$97.40909090909\ldots = \frac{2143}{22} = 97 + \frac{1}{2} - \frac{1}{11}$$

is a natural approximation to $\pi^4$.
Ramanujan’s facility with continued fractions is unequaled in mathematical
history, and so he might have observed that [16], [17], [4, p. 151]

$$\pi^4 = 97 + \cfrac{1}{2 + \cfrac{1}{2 + \cfrac{1}{3 + \cfrac{1}{1 + \cfrac{1}{16539 + \cfrac{1}{1 + \cdots}}}}}}.$$

Truncating this continued fraction just before the “super large” partial quotient
16,539 gives the approximation (4.8).
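
The partial quotients and convergents quoted above can be reproduced with exact rational arithmetic. The sketch below is our own illustration with hypothetical function names. It starts from the double-precision values of π and $\pi^4$, so only the first few partial quotients are trustworthy; that is an assumption of the sketch, and it suffices for the quotients 292 and 16539 discussed here.

```python
import math
from fractions import Fraction

def partial_quotients(x, count):
    """First `count` partial quotients of the simple continued fraction of x.

    The float is converted to an exact Fraction first, so only the leading
    quotients (those not yet affected by the rounding of x) are meaningful.
    """
    frac = Fraction(x)
    quotients = []
    for _ in range(count):
        q = frac.numerator // frac.denominator
        quotients.append(q)
        frac -= q
        if frac == 0:
            break
        frac = 1 / frac
    return quotients

def convergents(quotients):
    """Successive convergents p/q of a simple continued fraction."""
    p_prev, q_prev, p, q = 1, 0, quotients[0], 1
    result = [Fraction(p, q)]
    for a in quotients[1:]:
        p_prev, p = p, a * p + p_prev
        q_prev, q = q, a * q + q_prev
        result.append(Fraction(p, q))
    return result

if __name__ == "__main__":
    print(partial_quotients(math.pi, 5))
    print(convergents([3, 7, 15, 1]))
    print(partial_quotients(math.pi ** 4, 6))
    print(convergents([97, 2, 2, 3, 1])[-1], (2143 / 22) ** 0.25)
```

Truncating before a large partial quotient is what makes a convergent unusually accurate: the error of a convergent p/q is roughly 1/(q² × next quotient).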
We are very grateful to Richard Askey, R. William Gosper, Daniel Grayson, K.
Srinivasa Rao, and Bruce Reznick for valuable contributions.
Chebychev Polynomials and Regular Polygons


D. Y. Savio and E. R. Suryanarayan 


1. INTRODUCTION. Chebychev polynomials occur in many branches of mathematics:
interpolation theory, orthogonal polynomials, approximation theory, numerical
analysis, ergodic theory, etc. It is said that the Chebychev polynomial is
like a fine jewel that reveals its different characteristics under illumination from
varying positions [2]. There is yet another simple spot where it shows its radiance:
regular polygons. In this paper we study some of the properties of the Chebychev
polynomials of the second kind $u_n(x)$, defined in (2) below, and of an associated
polynomial, namely $u_n + u_{n-1}$, and learn that these polynomials are related to some
of the properties of the regular polygons. Specifically, we generalize a result due to
Kepler (1571-1630). Kepler observed that the squares of the edges of the polygons {7},
{7/2}, {7/3} of unit circumradius (all having the same 7 vertices) are the roots of the
equation

$$z^3 - 7z^2 + 14z - 7 = 0; \qquad (1)$$

here, {7} is a regular heptagon; {7/2} and {7/3} are star-polygons [1] (see Figure 1).


2. CHEBYCHEV POLYNOMIALS. Let x = cos θ. The Chebychev polynomials of the
second kind $u_n(x)$ are defined by [2]

$$u_0(x) = 1, \quad u_1(x) = 2x, \quad \ldots, \quad u_n(x) = \frac{\sin(n+1)\theta}{\sin\theta}, \qquad n = 1, 2, \ldots. \qquad (2)$$

$u_n(x)$ satisfies the classical recurrence

$$u_n(x) = 2x\,u_{n-1}(x) - u_{n-2}(x). \qquad (3)$$

The zeros of $u_n(x)$ are

$$\cos\frac{k\pi}{n+1}, \qquad k = 1, \ldots, n \qquad (4)$$

[2]. We need the following result for later use.


Theorem 1. Let $v_n(x) = u_n(x) + u_{n-1}(x)$. Then the zeros of $v_n(x)$ are
$\cos(2k\pi/(2n+1))$, k = 1, ..., n.


Proof: From the trigonometric definition of $u_n$ and elementary trigonometric
identities, we find that $v_n = \sin((n + \tfrac{1}{2})\theta)/\sin(\theta/2)$. Therefore, the zeros of $v_n(x)$
are $\cos(2k\pi/(2n+1))$, k = 1, ..., n. ∎

Let

$$U_n(x) = \begin{pmatrix} 2x & 1 & 0 & 0 & \cdots & 0 \\ 1 & 2x & 1 & 0 & \cdots & 0 \\ 0 & 1 & 2x & 1 & \cdots & 0 \\ \vdots & & \ddots & \ddots & \ddots & \vdots \\ 0 & \cdots & 0 & 0 & 1 & 2x \end{pmatrix}, \qquad (5)$$

where the matrix is symmetric, tridiagonal and is of order n. The result

$$u_n(x) = \operatorname{Det} U_n(x) \qquad (6)$$

follows from the recurrence (3) [3].


Theorem 2. The eigenvalues of $U_n(x)$ are $2(x - 1) + 4\sin^2(k\pi/(2n+2))$, k =
1, 2, ..., n.


Proof: From (6) the eigenvalues of $U_n$ are the zeros in λ of $u_n(x - (\lambda/2))$; but from (4)
these zeros are given by $2x - 2\cos(k\pi/(n+1))$. Therefore, since
$1 - \cos\phi = 2\sin^2(\phi/2)$, the eigenvalues of $U_n$ are $2(x - 1) + 4\sin^2(k\pi/(2n+2))$. ∎


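There is a pair of recurrences less well known which also generates $u_n$:

$$u_{2n} = (u_n + u_{n-1})(u_n - u_{n-1}), \qquad (7)$$

$$u_{2n+1} = u_n (u_{n+1} - u_{n-1}), \qquad n = 1, 2, \ldots. \qquad (8)$$

The preceding “odd-even” breakdown can be proved by using the definition (2)
and elementary trigonometric identities. The above relations suggest that $u_n$ is
always the product of two determinants which are themselves “Chebychev
polynomials” in the sense that $u_n + u_{n-1}$ and $u_{n+1} - u_{n-1}$ satisfy recurrence (3). Indeed,

$$u_n + u_{n-1} = 2x u_{n-1} - u_{n-2} + u_{n-1} = (2x + 1) u_{n-1} - u_{n-2} = \operatorname{Det}\begin{pmatrix} 2x & 1 & 0 & 0 & \cdots & 0 \\ 1 & 2x & 1 & 0 & \cdots & 0 \\ 0 & 1 & 2x & 1 & \cdots & 0 \\ \vdots & & \ddots & \ddots & \ddots & \vdots \\ 0 & \cdots & 0 & 0 & 1 & 2x + 1 \end{pmatrix} \qquad (9)$$

and

$$u_{n+1} - u_{n-1} = 2x u_n - 2u_{n-1} = \operatorname{Det}\begin{pmatrix} 2x & 1 & 0 & 0 & \cdots & 0 \\ 1 & 2x & 1 & 0 & \cdots & 0 \\ 0 & 1 & 2x & 1 & \cdots & 0 \\ \vdots & & \ddots & \ddots & \ddots & \vdots \\ 0 & \cdots & 0 & 0 & 2 & 2x \end{pmatrix}, \qquad (10)$$

where the matrices differ from (5) in just a single entry.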
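
The identities (7) and (8) can also be confirmed by exact polynomial arithmetic. The sketch below is our own check with hypothetical helper names: it builds the integer coefficient lists of $u_0, u_1, \ldots$ from the recurrence (3) and compares both sides of (7) and (8) coefficient by coefficient for small n.

```python
def poly_add(p, q, sign=1):
    """Return p + sign*q for coefficient lists (lowest degree first)."""
    length = max(len(p), len(q))
    p = p + [0] * (length - len(p))
    q = q + [0] * (length - len(q))
    return [s + sign * t for s, t in zip(p, q)]

def poly_mul(p, q):
    """Product of two coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, s in enumerate(p):
        for j, t in enumerate(q):
            out[i + j] += s * t
    return out

def chebyshev_u(count):
    """u_0 .. u_{count-1} as integer coefficient lists, from recurrence (3)."""
    polys = [[1], [0, 2]]
    while len(polys) < count:
        two_x_u = [0] + [2 * c for c in polys[-1]]   # 2x * u_{n-1}
        polys.append(poly_add(two_x_u, polys[-2], sign=-1))
    return polys[:count]

if __name__ == "__main__":
    u = chebyshev_u(8)
    n = 3
    print(u[2 * n] == poly_mul(poly_add(u[n], u[n - 1]),
                               poly_add(u[n], u[n - 1], sign=-1)))
    print(u[2 * n + 1] == poly_mul(u[n], poly_add(u[n + 1], u[n - 1], sign=-1)))
```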
When n = 2k + 1, let

$$V_k = \begin{pmatrix} 2x & 1 & 0 & 0 & \cdots & 0 \\ 1 & 2x & 1 & 0 & \cdots & 0 \\ 0 & 1 & 2x & 1 & \cdots & 0 \\ \vdots & & \ddots & \ddots & \ddots & \vdots \\ 0 & \cdots & 0 & 0 & 1 & 2x + 1 \end{pmatrix}, \qquad n = 1, 3, 5, \ldots, \qquad (11)$$

denote the square matrix of order k. When n = 2k, let $U_k$ be defined by (5). All
statements with $U_k$ and $V_k$ assume n = 2k and n = 2k + 1 respectively.


Theorem 3. The eigenvalues of $V_n(x)$, the matrix of the form (11) of order n, are
$2(x - 1) + 4\sin^2(k\pi/(2n+1))$, k = 1, 2, ..., n.


Proof: The eigenvalues of $V_n$ are obtained by solving the equation
$\operatorname{Det} V_n(x - \lambda/2) = 0$ for λ. Since $\operatorname{Det} V_n = v_n$, Theorem 1 shows that the zeros
in λ of $v_n(x - (\lambda/2))$ are $2x - 2\cos(2k\pi/(2n+1)) = 2(x - 1) + 4\sin^2(k\pi/(2n+1))$.
By an argument similar to that of Theorem 2, these zeros are indeed the
eigenvalues of $V_n$. ∎
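
Theorems 2 and 3 can be checked numerically without any linear-algebra library. The sketch below is our own illustration with hypothetical function names: it evaluates the characteristic determinants of the tridiagonal matrices by the three-term recurrence for tridiagonal determinants and confirms that they vanish at the n claimed eigenvalues. Since a characteristic polynomial of degree n has at most n roots, n distinct zeros account for the whole spectrum.

```python
import math

def tridiagonal_det(diagonal):
    """Determinant of a symmetric tridiagonal matrix with unit off-diagonal
    entries, via d_k = diag_k * d_{k-1} - d_{k-2}."""
    prev, cur = 1.0, diagonal[0]
    for entry in diagonal[1:]:
        prev, cur = cur, entry * cur - prev
    return cur

def char_poly_U(n, x, lam):
    """Det(U_n(x) - lam*I), with U_n as in (5)."""
    return tridiagonal_det([2 * x - lam] * n)

def char_poly_V(n, x, lam):
    """Det(V_n(x) - lam*I); V_n has 2x + 1 as its last diagonal entry."""
    return tridiagonal_det([2 * x - lam] * (n - 1) + [2 * x + 1 - lam])

def eigen_U(n, x, k):
    """k-th eigenvalue of U_n(x) claimed by Theorem 2."""
    return 2 * (x - 1) + 4 * math.sin(k * math.pi / (2 * n + 2)) ** 2

def eigen_V(n, x, k):
    """k-th eigenvalue of V_n(x) claimed by Theorem 3."""
    return 2 * (x - 1) + 4 * math.sin(k * math.pi / (2 * n + 1)) ** 2

if __name__ == "__main__":
    n, x = 5, 0.3
    print([char_poly_U(n, x, eigen_U(n, x, k)) for k in range(1, n + 1)])
    print([char_poly_V(n, x, eigen_V(n, x, k)) for k in range(1, n + 1)])
```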


3. APPLICATION TO REGULAR STAR-FIGURES AND POLYGONS. A regular
star-figure is a figure formed by connecting with straight lines every qth point,
starting with one of the points that divide a circumference into n equal parts
(2q < n); if all the n points are not connected, then start from the unconnected
point next to the initial point, and repeat the connecting procedure until all the n
points are connected. Such a star-figure is denoted by {n/q}. If q = 1, we have a
regular convex polygon {n} of n sides. If n and q are relatively prime, the
star-figure is a star-polygon or an n-gram. For a given n there are φ(n)/2 regular
n-grams, where φ(n) is the Euler function, the number of positive integers less than n
and prime to it. If n and q are not relatively prime, then {n/q} consists of
symmetrically superposed convex polygons. For example, {5/2} is a pentagram,
whereas the star-figure {6/2} (the star of David) is formed by two equilateral
triangles symmetrically superposed.

Regular octagons and 16-gons occur in the mural decorations of ancient Egypt.
Pentagrams and hexagons were used by the Babylonians. Pythagoreans used the
pentagram as a symbol of good health and also as a badge of recognition. Hindus
use the star of Lakshmi to symbolize the eight forms of wealth (Ashtalakshmi).
Buddhists and Hindus draw a mandala, an elaborate form of star-figures and
star-polygons, on ceremonial altars. The systematic study of the star-polygons was
initiated and some of their properties were developed by Bradwardine (1290-1349),
an English cleric who became Archbishop of Canterbury for the last month of his
life.

Consider a regular star-figure of n sides. Let O be the center, M the mid-point
and A one end of a side, and let AM = l/2 and OA = R be the circumradius
of the star-figure (Figure 2). The angle AOM is π/n for {n} and qπ/n for the
star-figure {n/q}, and the edges are 2R sin(π/n) and 2R sin(qπ/n) respectively.

From the above results and Theorems 2 and 3, we have the following
generalization of Kepler’s observation:


Theorem 4. Let l be an edge and R the circumradius of a regular star-figure of n
sides. The eigenvalues of the matrices $V_k(1)$ (of order k) and $U_{k-1}(1)$ (of order k − 1)
are the ratios $(l/R)^2$ of the regular star-figures {n/j}, j = 1, 2, ..., k, if n = 2k + 1,
and j = 1, 2, ..., k − 1, if n = 2k.


As a special case, consider the characteristic equation of $V_3(1)$:

$$\begin{vmatrix} 2-\lambda & 1 & 0 \\ 1 & 2-\lambda & 1 \\ 0 & 1 & 3-\lambda \end{vmatrix} = 0.$$

On expansion, the above equation reduces to $\lambda^3 - 7\lambda^2 + 14\lambda - 7 = 0$. Similarly,
the three roots of the characteristic equation of $U_3(1)$:

$$\begin{vmatrix} 2-\lambda & 1 & 0 \\ 1 & 2-\lambda & 1 \\ 0 & 1 & 2-\lambda \end{vmatrix} = 0$$

give $(l/R)^2$ for {8}, {8/2}, {8/3}.

[Figure 1: (a) {7}, (b) {7/2}, (c) {7/3}. Figure 2: a side of a regular star-figure,
with center O, mid-point M, and end A.]
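
Kepler's observation and the octagon example can be confirmed directly. The sketch below is our own check with hypothetical function names: it evaluates Kepler's cubic (1) at the squared edges 4 sin²(qπ/7) of the unit-circumradius polygons {7}, {7/2}, {7/3}, and the expanded characteristic polynomial of $U_3(1)$ at 4 sin²(qπ/8).

```python
import math

def kepler_cubic(z):
    """Left-hand side of Kepler's equation (1)."""
    return z ** 3 - 7 * z ** 2 + 14 * z - 7

def octagon_cubic(lam):
    """Det(U_3(1) - lam*I) = d**3 - 2*d with d = 2 - lam."""
    d = 2 - lam
    return d ** 3 - 2 * d

def squared_edges(n, qs):
    """(l/R)^2 = 4 sin^2(q*pi/n) for the star-figures {n/q}."""
    return [4 * math.sin(q * math.pi / n) ** 2 for q in qs]

if __name__ == "__main__":
    print([kepler_cubic(z) for z in squared_edges(7, (1, 2, 3))])
    print([octagon_cubic(z) for z in squared_edges(8, (1, 2, 3))])
```

For the octagon the three values are 2 − √2, 2, and 2 + √2, so the check can also be done by hand.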
Small-Group Learning


Julian Weissglass 


In any math class I’ve been in before, I just sat and listened to a teacher talk about what was in the 
book and what would be assigned for homework. When I got to college the same thing was 
happening except here I take notes on what the professor says. I have never felt in any of these 
situations that I should express my opinion on the subject. The most any teacher has done to 
stimulate a discussion on the topic was to simply say ‘Questions?’ Whenever the teacher said this, 
though, it didn’t sound like he wanted a reply. 


Junior Mathematics Major 


In a span of less than two years, three national reports [6, 9, 10] have recom- 
mended fundamental changes in the teaching of college mathematics. The most 
recent document Moving Beyond Myths [10] states, for example, that “It is widely 
recognized that lectures place students in a passive role, failing to engage them in 
their own learning. Even students who survive such courses often absorb a very 
misleading impression of mathematics—as a collection of skills with no connection 
to critical reasoning” (p. 24). The document recommends that faculty, among other
things, “explore effective alternatives to ‘lecture and listen,’” “involve students
actively in the learning process,” and “teach future teachers in the ways they will
be expected to teach” (p. 34).

If we take seriously the charge to “teach future teachers in the ways they will be 
expected to teach,” a reading of the Professional Standards for Teaching Mathemat- 
ics [8] (which is referred to in Moving Beyond Myths) will lead to using small group 
approaches for at least part of the class time. This document states, “Students’
learning of mathematics is enhanced in a learning environment that is built as a 
community of people collaborating to make sense of mathematical ideas. It is a key 
function of the teacher to develop and nurture students’ abilities to learn with and
from others—to clarify definitions and terms to one another, consider one an- 
other’s ideas and solutions, and argue together about the validity of alternative 
approaches and answers...” (p. 58). 

Reports and recommendations, of course, do not make changes in the class- 
room. Only teachers doing things differently achieve that. Changing teaching, 
however, is not easy. There is both individual and institutional resistance to change. 
My own experience with resistance occurred during my first attempt to use a small
group approach in a linear algebra class I taught in 1970, my third year as a faculty
member. Although the students liked the class, I was so afraid that my colleagues
would find out what I was doing that I closed the door of the classroom in case any 
of them walked by. My anxiety caused me to abandon the approach for three years. 

At the institutional level, it was not until 1991 that the MAA annual meeting 
provided a special session devoted to alternatives to the lecture method, although 
articles [3, 11] appeared in the 70’s describing this approach in mathematics 
courses and an increasing number of studies (see [4] for references) showed the 
effectiveness of small group cooperative learning approaches. 

Having overcome, to some degree, my own resistance to pedagogical change, I 
thought it would be helpful to offer some suggestions to faculty considering
implementing small group approaches. In a sense, this is the article I wish I had
been able to read twenty years ago.


BEGINNING. Do not be afraid to start slowly. It is not necessary to abandon 
lecturing completely. For some purposes it is a good method. You can combine 
lecturing, students working in groups, and whole-class discussion in any proportion 
you desire. One way to start is to have students form a group or pair and discuss 
how they solved a homework problem. Alternatively, you can pose an open-ended 
question for them to think about. Have them write about it (with a ‘quick write’) 
and then report their initial thinking to their group. Providing students time to 
think and write individually before sharing in the group is often helpful. Not all 
students want to start talking right away. 

Another workable method is to set aside a portion of class time for students to 
discuss a concept or work on a problem, an investigation, or a group project. Some, 
or all, groups can report on their work to the class (either as a progress report or a 
final report). You can add perspective and background information as needed. In a 
large class, where it is cumbersome to use groups, you might have the students 
spend some time working in pairs—discussing a definition, sharing thoughts about 
a problem, comparing solutions or exploring a concept. Your Teaching Assistant 
can, with some encouragement from you, use small groups in discussion sections. 

In order to ease the transition from lectures, provide an experience early on in 
the course demonstrating that a small group approach enhances learning in ways 
that lectures do not. For example, I often begin my class on problem solving with 
Counting Squares. Students are given a problem (Figure 1) and asked to work
individually. 


Counting Squares 
(individual) 


How many squares are there in the figure below? Be able to defend your answer. Work
by yourself.


Figure 1 


After about 10-15 minutes they are arranged in groups and given the problem 
in FIGURE 2. 


Counting Squares 
(small groups) 


How many squares are there in the figure below? Work in your groups. Make sure that everyone 
is able to defend the answer. 


Figure 2

After the activity, I ask them to reflect on the process with the following
instructions: Each person tells how they felt when doing the problem alone and 
as a group. Discuss the differences between individual learning and small group 
learning. 

Another activity that shows students how small groups can enhance learning is 
Missing Corners. In this activity the students are asked to (individually) write a 
description of the pattern in Figure 3, construct with cubes (or tiles) the next two
figures in the pattern, predict the number of cubes in the nth figure and write a
justification for their prediction based on the figures. The students are amazed
when they arrive at different ways of describing the pattern and justifying the
predictions. (This can be followed by examining more complex, even 3-dimensional,
patterns.)


Figure 3 


One obstacle when college students begin to work in groups is their lack of 
experience communicating about mathematics. It is important therefore that early 
group activities develop communication skills rather than stress solutions or proof. 
A colleague of mine, Bill Jacob, addresses this issue in a geometry class by having 
one student draw a geometric figure, a second write instructions on how to draw 
the figure for a third student, who then draws the figure without having seen the 
original figure. The figures are compared and the results discussed. Then the roles 
are rotated. He also has the students write reports on experiments (for example, 
projective geometry experiments with mirrors). 

There are not many examples of college level curriculum written specifically for 
small group instruction. Two older texts [5, 12] attempt to present traditional 
course content for a small group approach. There is more available for the 
pre-college level and these sources may provide ideas for what can be done at the 
college level. The Interactive Mathematics Project¹ and the California Math A
materials² are good examples of non-traditional approaches at the secondary level.
Bishop [1] is a good resource for thinking about how to restructure curriculum for 
group projects and discussion. I used Thinking Mathematically [7] successfully in a 
problem solving course for potential secondary teachers. A good source for reading 
about what other people have done is [4]. Be aware, however, that some of the 
authors in this book have a very traditional view of mathematics and there is 
considerable disagreement about classroom practices as well. 

¹This project is developing a three-year problem-based high school mathematics
course. Contact Interactive Mathematics Project-EQUALS, University of California,
Berkeley, CA 94720.

²This material was developed by California secondary teachers to meet the guidelines
of the 1985 California Framework. It is being rewritten by Larry Hatfield for
publication by Glencoe Publishing Company in June, 1993.

It may be necessary to change your ideas of “covering” curriculum. It will help
to reflect on the questions: what does it mean to teach? what does it mean to
learn? College faculty need to think about and discuss the relative value of
exposing students to mathematical knowledge or having them actually do
mathematics. For example, a class for potential secondary teachers explored symmetry by
examining some strip patterns from Native American (San Ildefonso Pueblo) 
pottery. I then asked them to create their own strip patterns with pattern blocks 
(colored squares, triangles, trapezoids, parallelograms, and hexagons). I then asked 
the students to classify the strip patterns. With very little help from me most 
groups (in three to four hours of class time) discovered the seven different classes 
and were able to justify (although not rigorously) that these were all of them. I 
could have lectured about it in an hour or two, but I think that the level of 
understanding would have been shallower. There are no easy answers to questions 
about breadth versus depth—perhaps no answers at all—but it is beneficial to 
reflect on and discuss the questions. 


STUDENTS. Students will probably be skeptical about participating in a small 
group at first. You need to explain to them why you are deviating from the 
traditional lecture method—and remind them periodically of your reasons and 
“philosophy of education”. Some students may continue to struggle with the 
different approach: 


Once again as in Math 101A I have mixed feelings arising out of working in groups. One thing, 
working in small groups tends to make me more visible to others. That means my strong points, in 
between points, and weak points are right there for everyone to see. It is very hard for me to expose 
my weaknesses to others, i.e., my mistakes. Working in small groups tends to make me confront a 
feeling of stupidity. It is hard to overcome the urge to compare myself to others and to try to come 
up with the correct answer. I tend to underplay ‘correct’ contributions (i.e., a good idea) and 
overplay any errors I make, so it tends to be a struggle for me.


Others will make the transition more readily: 


To be honest, when this class first began I did not enjoy it very much. It is hard for me to pin down 
why. In part it had a bit to do with the groups. It was not the fact that I did not know my group yet. 
I knew that we all would get to know one another. It was more because the people in my group 
seemed to be so much brighter than me. It did not seem as if I would ever have anything worth while 
to contribute... . After a few weeks of classes we all felt comfortable. We not only discussed math 
topics but also what we did over the weekend. How things were going etc. It was no longer a state of 
unfamiliarity or any anxiety over making a mistake or saying something foolish ... [If we had not
been in a group] I do not think we would have become friends.


Many come to understand and value the benefits of small group instruction: 


Working in small groups is very different than lecturing only. There is no strict relationship where 
one person knows all the answers (teacher) and the other asks the questions (student). Working in a
group is a more equal relationship where hopefully everyone is answering and asking questions. I like
working in a small group, because it forces you to think rather than just copy whatever the teacher
writes. 


To be honest, in the past the only method that I knew to learn mathematics was to memorize so, 
therefore, if I memorized the material well I felt pretty good as a learner, but now, however, I 
realize that I have been somewhat cheated on what and how I learned mathematics. It just seems 
like I should have a better understanding of what I have learned in the past. 


It is important to pay attention to the quality of the group process. Every three 
to four weeks I have students assess their group’s functioning. I ask them to answer 
two questions in their weekly journal: What are you doing to contribute to the 
group’s functioning well? What can you do to improve? Then I visit each group 
and sit down with them and ask each person to talk about their answers. This 
method provides them time to think about the questions free from pressure, but
ensures that the group is communicating about group processes. It also indicates
clearly that I value group process, since I devote my time to assessing it.

Take time to interact with students. They will be uneasy about working in 
groups and will need time to talk about it. The relationship with the instructor is 
crucial in making the small group approach work. One student addressed the issue 
in his journal: 


I believe that a very useful addition to this course would be to require, perhaps during the second or 
third weeks, each student to make an appointment during office hours. I believe a one on one 
discussion on ‘what do you want to get out of this course?’ to ‘what do you want to get out of 
teaching?’ would enhance the entire course. I believe that it would make the students even more 
aware of what they can get out of the course, as well as, being aware of the usefulness of being 
available to students for one-on-one talks. Offering office hours does a lot. However, requiring us
to take advantage of office hours would be excellent. 


Be aware of the effect of grading on small groups. The first day (of a course in 
problem solving for prospective secondary teachers) I told the students how I 
would grade (see Figure 4) and in particular that I did not value memorization 
but would assess progress in their mathematical reasoning and their ability to 
communicate (verbally and in writing) about mathematics. I told them that I 
wanted to try (for the first time) using student portfolios to assess their work. 


Attendance                                            10%
Contribution to group and class (including an
  assessment of a portfolio of their work)            20%
Five problem sets                                     30%
Journal                                               20%
Final exam (oral)                                     20%

Figure 4


They were a little uneasy about this, so in the third week of the semester I 
spoke more about my philosophy, grades, and what portfolios were. I asked them 
to suggest what kind of evidence would show growth in mathematical thinking. We 
made a list and I indicated that they should include this type of evidence as part of 
their portfolio. Because I had devised what I thought would be a very acceptable 
grading method, and spent some time in class discussing it, I was surprised to read 
what one student wrote in her journal: 


The assessment lecture bothered me. Up to that day working in the groups and learning was fun. I 
had been thoroughly enjoying the class but when the portfolio came up and I realized that something 
was going to be ‘graded’ my perspective of the class began to change. All of a sudden I had to pay 
attention to what I was writing down. ‘Is it neat enough?’ ‘Am I writing enough?’ ‘Have I misplaced 
something that I should have kept?’ These questions and slight panic began to be aroused in me. 
That day our group discussion was much more jumpy and less relaxed. For the first time our ideas 
came across in a competitive way. I cannot really explain why we became more interested in getting 
our ideas on paper than playing with the problem. With the knowledge that our progress was going 
to be measured, our performance became more forced and less enjoyable. 


I have not solved the problem raised by this student. Certainly anxiety about 
grades is not unique to the small group approach. I have long believed that any 
“outside” (by someone other than the learner) evaluation of learning interferes 
with the learning process—with the possible exception of assessment conducted as 
an integral part of the learning process with the goal of assisting the learner. 
Furthermore I consistently find that my dual responsibilities of facilitating learning 
and evaluating it are inconsistent. Ideally a learner would be willing to reveal 
his/her ignorance to a teacher. In reality, he/she may be reluctant to do so to an 
evaluator. Although the small group approach reduces the interference of grades 
with learning (for example, anxiety is reduced by talking about grades with friends, 
students are graded on more than just test results) it does not eliminate it. I am 
not comfortable with grading and I admit my dilemma to my students. After 
reading the above student’s journal, I read the passage (anonymously) to the class 
and we discussed grades, competition and learning. It seemed to help. 

While on the subject of assessment, it is worth pointing out the obvious. Often 
students memorize to get by on tests. Success in this system does not necessarily 
mean that students have learned (understood and are able to use) the mathemat- 
ics. We often do not notice this when using the lecture method because we only 
see test results. When you observe students working in groups, however, you will 
see more clearly what students do and do not understand. It can be disconcerting. 
In a class on classical number systems, for example, I asked students to use a 
concrete model to justify the familiar algorithm for adding fractions. They had 
tremendous difficulty coming up with an explanation. 

A final point in regard to your students: Do not be too hard on them. There is 
an old saying “Don’t blame the messenger who brings bad news.” In a sense, 
undergraduates who cannot think or communicate well about mathematics are the 
message that something is drastically wrong with our education system. Small 
group instruction is not a panacea. It will not immediately remedy the deficiencies 
of previous miseducation. But it is a start. 


INSTITUTIONAL AND PERSONAL SUPPORT. It will not be easy to give up the 
lecture method. Both institutional and personal support will be helpful in making 
the change. The Action Plan of Moving Beyond Myths makes many institutional 
recommendations. Draw these to the attention of relevant officials and organize on 
your campus for implementing the suggested reforms. 

At present there is little opportunity for college faculty to participate in 
professional development focused on teaching. The educational community re- 
gards professional development in both content and methodology as a necessary 
part of pre-college teachers’ professional growth. For college instructors, however, 
professional development focuses on learning more mathematics or on suggested 
revisions in content, not learning about new pedagogical approaches or research in 
mathematics education. 

Until there is adequate opportunity to participate in professional development 
activities focused on teaching, individuals will have to strike out on their own. It 
may be possible to arrange your own professional development by watching 
someone who is using small groups or participating in a small group experience 
taught by someone else. In the long run, however, the attitudes and practices 
within the profession concerning professional development will need to change if 
large numbers of college faculty are to obtain the support necessary to implement 
the goals of the reform movement. 

Even with institutional support for change you will need to get personal support 
if you intend to change your teaching. Find people with whom you can discuss 
mathematics teaching—your ideas, your successes and failures. (Accept that there 
will be failures.) In addition, find someone who is able to listen to you non-criti- 
cally. It will be helpful to reflect on what you are doing and deal with your feelings 
about your efforts without fear of criticism. I did not have that 20 years ago and 
that is one reason why I abandoned my experimentation for three years. When you 
are feeling tense or worried about whether you are doing the right thing you will 
tend to revert to the ‘tried and true.’ If you have someone to talk to about your 
feelings it is more likely that you will be able to think through the issues, and 
pursue your goals. See [13, 14] for further information about the relationship 
between feelings, listening and educational change. 


CONCLUSION. Teaching using small groups is very different from lecturing. You 
will need time to develop your abilities. Be prepared for ambivalence and doubts. I 
encourage you to persist. Virtually every teacher (elementary or secondary) takes 
mathematics courses in a college or university. How you teach mathematics to 
undergraduates affects mathematics education throughout the entire system. You 
can play a crucial role in modeling for future teachers how to teach so that 
students are actively engaged in doing mathematics. If you are satisfied with large 
numbers of students not understanding or liking mathematics, with an attrition 
rate for mathematics students of approximately 50% each year after 9th grade [2], 
then continue with the lecture method. But if you want to provide opportunities 
for larger numbers of students to gain deeper understandings and to improve their 
ability to communicate about mathematics, then explore small group approaches 
and other alternatives to the lecture method. Perhaps you will be rewarded by 
having a future secondary teacher write: I want to implement in my classroom what 
we did in this class. The most important thing that I have learned is that math can be 
fun. 
A Fast Pick-Type Approximation 
for Areas of H-Polygons 


Ding Ren, Krzysztof Kolodziejczyk, Grattan Murphy, 
and John Reay 


1. INTRODUCTION AND DEFINITIONS. Pick’s formula $b/2 + i - 1$ gives the 
area of a simple polygon in $\mathbb{R}^2$ whose corners lie in the integer lattice, and which 
has $b$ lattice points on its boundary and $i$ lattice points in its interior. It has been 
the object of many studies since its proof by Pick [6] in 1900. Throughout this 
paper we assume $P$ is an H-polygon, i.e., a simple polygon whose corners lie in the 
set $H$ of vertices of a monohedral tiling of $\mathbb{R}^2$ by regular hexagons of unit area. 
See Figure 1 for examples. The vertices of this hexagonal tiling have density 2 
(that is, each hexagon may be associated with 2 vertices), in contrast to the points 
of the integer lattice used in Pick’s theorem, which have density one. Therefore it 
is reasonable to define Pick’s approximation for the area $\mu(P)$ of an H-polygon $P$ 
by 


$$F(P) = (b/2 + i - 1)/2. \tag{1}$$


Figure 1. Mensurable H-polygons. 


For example, in Figure 1 if $P$ is the triangle ABC or triangle DEF then $b = 3$ 
and $i = 0$, so Pick’s approximation is $F(P) = 1/4$, while the true areas are 
$\mu(ABC) = 1/6$ and $\mu(DEF) = 1/2$. Also triangle GHI has area $\mu(GHI) = 1/2$ 
and approximation $F(P) = (3/2 + 1 - 1)/2 = 3/4$. In the next section we find 
bounds on the size of the error of this Pick-type approximation for the area; this 
will show that $F(P)$ is, in some sense, a very good approximation. 

The exact area of many H-polygons, like those in Figure 1, may be found by 
computing one additional parameter, the boundary characteristic. Every vertex 
$X \in H$ of the hexagonal tiling that is also on the boundary $\partial P$ of $P$ is the endpoint 
of 3 edges of the hexagonal tiling. Define the boundary characteristic $c(X, P)$ of $P$ 
at $X$ to be the number of those 3 edges that extend locally into the exterior of $P$ 
from $X$, minus the number that extend locally into the interior of $P$ from $X$. Then 
the boundary characteristic of $P$ is defined as $c = c(P) = \sum_{X \in H \cap \partial P} c(X, P)$. For 
example, if $P$ is the irregular hexagon JKLMNOQ in Figure 1, then $b = 7$, $i = 0$, 
and $c = 2 - 1 + 2 + 3 - 3 + 3 + 1 = 7$. If points of $H$ occur frequently along 
the boundary $\partial P$ of $P$ (specifically, if the neighboring H-points on $\partial P$ are closer 
than the distance from R to S in Figure 1), then it is shown in [2] that the area 
$\mu(P)$ of $P$ is given exactly by 


$$A(P) = b/4 + i/2 + c/12 - 1. \tag{2}$$


(This is easily checked for the examples in Figure 1.) An H-polygon is called 
mensurable if $\mu(P) = A(P)$. The mensurable H-triangles have been characterized 
in [5]. Using the Pick approximation F(P) for the area of P is faster than 
computing areas with (2) since it saves the computation of the boundary character- 
istic. In the next section we first get sharp inequalities between the parameters b 
and c for H-polygons, and then use them to justify the use of the fast Pick’s 
approximation. Scott [7] and Coleman [1] have considered similar inequalities 
between b and i for convex polygons with corners in the integer lattice. See [4] and 
[8] for related results and further bibliography on Pick’s Theorem. 
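As a rough illustration of formulas (1) and (2), the following Python sketch evaluates both on the Figure 1 examples quoted above. The function names are hypothetical, and the boundary characteristics of the three triangles are not stated in the text: they are inferred here by solving formula (2) for $c$ from the stated areas, under the assumption that the triangles are mensurable.

```python
from fractions import Fraction

def pick_approximation(b, i):
    """Formula (1): F(P) = (b/2 + i - 1) / 2."""
    return (Fraction(b, 2) + i - 1) / 2

def exact_area(b, i, c):
    """Formula (2): A(P) = b/4 + i/2 + c/12 - 1 (exact only for mensurable P)."""
    return Fraction(b, 4) + Fraction(i, 2) + Fraction(c, 12) - 1

def characteristic_from_area(b, i, area):
    """Solve formula (2) for c, given b, i and a known exact area."""
    return 12 * (area + 1 - Fraction(b, 4) - Fraction(i, 2))

# (b, i, stated area) for the triangles of Figure 1, as quoted in the text.
examples = {
    "ABC": (3, 0, Fraction(1, 6)),
    "DEF": (3, 0, Fraction(1, 2)),
    "GHI": (3, 1, Fraction(1, 2)),
}

for name, (b, i, area) in examples.items():
    c = characteristic_from_area(b, i, area)        # inferred, not from the text
    error = abs(pick_approximation(b, i) - area)    # error of the fast estimate
    print(name, "c =", c, "F =", pick_approximation(b, i), "error =", error)

# The irregular hexagon of the text: b = 7, i = 0, c = 7.
print("hexagon area:", exact_area(7, 0, 7), "F =", pick_approximation(7, 0))
```

The sketch uses exact rational arithmetic so that the comparison between $F(P)$ and the area involves no rounding.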


2. BOUNDARY CHARACTERISTIC BOUNDS AND PICK’S APPROXIMATION. 
Let $P$ denote any simple H-polygon with $b = |H \cap \partial P|$ and boundary characteristic 
$c$. Triangles GHI and DEF of Figure 1 (and other examples with any $b \ge 3$) 
show that the inequalities in the following theorem are sharp. 


Theorem 1. For any simple H-polygon, $-b \le c - 6 \le b$. 


Theorem 1 may be used to provide a bound on the size of the error which 
occurs in using Pick’s formula $F(P)$ to approximate the area $\mu(P)$ of mensurable 
H-polygons. 


Theorem 2. If $P$ is a mensurable H-polygon then 


$$|F(P) - \mu(P)| \le b/12.$$


Proof: If $P$ is a mensurable H-polygon then $A(P)$ in formula (2) gives the exact 
area $\mu(P)$ of $P$. Use the inequalities $c \le b + 6$ and $c \ge -b + 6$ from Theorem 1 
to replace $c$ in the formula (2), and simplify. The result is immediate. ∎ 
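The simplification can be written out as follows. This is only a worked restatement of the step described in the proof, using formulas (1) and (2) and Theorem 1.

```latex
\begin{align*}
F(P) - \mu(P) &= \Bigl(\frac{b}{4} + \frac{i}{2} - \frac{1}{2}\Bigr)
               - \Bigl(\frac{b}{4} + \frac{i}{2} + \frac{c}{12} - 1\Bigr)
               = \frac{6 - c}{12}, \\
|F(P) - \mu(P)| &= \frac{|c - 6|}{12} \le \frac{b}{12}
               \qquad \text{since } -b \le c - 6 \le b .
\end{align*}
```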


The triangles in Figure 1 show that the bound in Theorem 2 cannot be 
improved in general. 


Proof of Theorem 1: We will choose a point in the relative interior of a side of $P$ 
and traverse the boundary $\partial P$ once in a counterclockwise direction, keeping track 
of changes in two parameters, the boundary characteristic and the deflection 
number, which will change only at points of $H \cap \partial P$. The deflection number as 
used in this proof will be defined in Table 1 for each $X \in H \cap \partial P$ in such a way 
that it is always an integer multiple of 1/6. The sum of the deflection numbers 
over all such $X$ will agree with the rotation number of $P$ as defined in [3] and [4], 
and will always be 1, which represents the fact that the direction of travel makes 
one complete rotation (of $2\pi$) as we traverse once around $\partial P$ in a counterclockwise 
direction. We may assume that the tiling at a typical vertex $X \in H \cap \partial P$ is 
oriented as shown in Figure 2, so that the 3 edges of the tiling which meet at $X$, 
together with their reflections in $X$, form 6 sectors about $X$, each of size $\pi/3$. 
Then the side $S$ of $P$ being traversed either approaches vertex $X$ along the 
segment $WX$ which is parallel to a tiling edge, or through the interior of sector 
$WXY$ (as shown in Figure 2) thereby dividing sector $WXY$ into 2 smaller sectors. 
In either case, let $\theta$ be the angle between $WX$ and side $S$, with $0 \le \theta < \pi/3$. The 
sectors (numbered 1 through 7 in Figure 2) and the deflection number $d(X)$ for 
each sector are defined in Table 1. 


Figure 2. Sectors determined by a typical $X \in H \cap \partial P$. 


TABLE 1. Deflection number when the boundary traverse 
leaves $X$ in sector $i$. (Sector 7 does not exist 
if $\theta = 0$ and side $S$ contains $WX$.) 


Sector     Angle $\phi$ from $XW$          Deflection 
number     to leaving side              number $d(X)$ 
  1        $\theta < \phi < \pi/3$                 $-3/6$ 
  2        $\pi/3 \le \phi < 2\pi/3$             $-2/6$ 
  3        $2\pi/3 \le \phi < \pi$               $-1/6$ 
  4        $\pi \le \phi < 4\pi/3$               $0$ 
  5        $4\pi/3 \le \phi < 5\pi/3$            $1/6$ 
  6        $5\pi/3 \le \phi < 2\pi$              $2/6$ 
  7        $0 \le \phi < \theta$                   $3/6$ 


We distinguish three types of vertices $X \in H \cap \partial P$ depending on how the 
boundary passes through $X$ on our traverse: 


Type 1. The boundary traverse approaches $X$ along side $S$ with $\theta > 0$ (as shown 
in Figure 2) and leaves $X$ through the interior of some sector, or else, the 
traverse approaches $X$ along $WX$ (so $\theta = 0$ and Sector 7 does not exist) and 
leaves $X$ on a side parallel to an edge. 


Type 2. Angle $\theta > 0$ and the traverse leaves $X$ on a side which is parallel to an 
edge. 


Type 3. The traverse approaches $X$ along $WX$ (so $\theta = 0$) and leaves through the 
interior of some sector. 


TABLE 2. Bounds for the boundary characteristics. 


Sector        Type 1            Type 2            Type 3 
number     $c_M$    $c_m$       $c_M$    $c_m$       $c_M$    $c_m$ 
  1         -3     -3      (no such vertices)    -2     -3 
  2         -1     -3        -2     -3           -1     -2 
  3         -1     -1        -1     -2            0     -1 
  4          1     -1         0     -1            1      0 
  5          1      1         1      0            2      1 
  6          3      1         2      1            3      2 
  7          3      3         3      2       (no such vertices) 


Table 2 shows the maximum $c_M(X, P)$ and minimum $c_m(X, P)$ possible values 
of $c(X, P)$, when vertex $X$ is of each of the above 3 types. It will follow from the 
definition of the deflection number that 


$$\sum_{X \in H \cap \partial P} d(X) = 1. \tag{3}$$


First, suppose that for each vertex $X$ of $P$, our traverse of $P$ always both 
approaches $X$ and leaves $X$ in the exact center of one of the 6 sectors of size $\pi/3$ 
shown in Figure 2. For this special case of $P$, the deflection number at each $X$ 
agrees with the rotation number, takes a value from the discrete set $\{k/6 \mid k = 
-2, -1, \ldots, +2\}$, and sums (over all $X \in H \cap \partial P$) to 1 full rotation. Hence (3) 
holds. To show (3) for a general H-polygon $P$, note that for each edge $\langle X, V \rangle$ of 
$P$ the angle between $\langle X, V \rangle$ and the center of the sector it enters at $X$ is exactly 
the negative of the angle between $\langle X, V \rangle$ and the center of the sector which it 
leaves at $V$. Thus the sum of the angle deflections is the sum of the deflection 
numbers and (3) holds for general $P$. 

It is also clear that 


$$c_m = \sum_{X \in H \cap \partial P} c_m(X, P) \le c \le \sum_{X \in H \cap \partial P} c_M(X, P) = c_M \tag{4}$$


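by the definition of the boundary characteristic. Define $t_{ij}$ to be the cardinality of 
the set $\{X \in H \cap \partial P \mid \partial P$ leaves $X$ in sector $i$, and $X$ is of type $j\}$ for $i = 1, 2, \ldots, 7$ 
and $j = 1, 2, 3$. Then $b = \sum_{i \in \{1, 2, \ldots, 7\}} \sum_{j \in \{1, 2, 3\}} t_{ij}$ and (3) may be rewritten 
(denoting $\sum_j t_{ij}$ by $t_i$) as 


$$6 = -3t_1 - 2t_2 - t_3 + t_5 + 2t_6 + 3t_7 \tag{3'}$$

and the right side of (4) becomes 

$$c_M = \sum_{i=1}^{7} \sum_{j=1}^{3} c_M(i, j)\, t_{ij},$$

where $c_M(i, j)$ denotes the Table 2 entry for sector $i$ and type $j$ (with $t_{ij} = 0$ where 
no such vertices exist). Using the above expressions for $b$, 6, and $c_M$ in terms of the 
$t_{ij}$’s, it follows that 


$$6 + b = c_M + [t_{11} + t_{31} + t_{51} + t_{71} + t_{22} + t_{32} + t_{42} + t_{52} + t_{62} + t_{72}] \ge c_M \ge c.$$


Using the left inequality of (4) in a similar way it follows that 


$$6 - b = c_m - [t_{11} + t_{31} + t_{51} + t_{71} + t_{13} + t_{23} + t_{33} + t_{43} + t_{53} + t_{63}] \le c_m \le c.$$


This gives the desired inequalities. ∎ 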
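The bookkeeping behind the two inequalities can be checked mechanically. Summing $6d(X) + 1$ over all $X \in H \cap \partial P$ gives $6 + b$ by (3), so it suffices that $6d + 1 \ge c_M$ for every sector and type, and likewise $6d - 1 \le c_m$ gives $6 - b \le c$. The Python sketch below, with hypothetical names, transcribes Tables 1 and 2 and verifies these per-vertex inequalities; it is a sanity check of the tables under that reading, not part of the original proof.

```python
from fractions import Fraction

# Table 1: deflection number d for leaving sectors 1..7 (-3/6, ..., 3/6).
deflection = {sector: Fraction(sector - 4, 6) for sector in range(1, 8)}

# Table 2: table2[sector][type] = (c_M, c_m); missing keys mean "no such vertices".
table2 = {
    1: {1: (-3, -3), 3: (-2, -3)},
    2: {1: (-1, -3), 2: (-2, -3), 3: (-1, -2)},
    3: {1: (-1, -1), 2: (-1, -2), 3: (0, -1)},
    4: {1: (1, -1), 2: (0, -1), 3: (1, 0)},
    5: {1: (1, 1), 2: (1, 0), 3: (2, 1)},
    6: {1: (3, 1), 2: (2, 1), 3: (3, 2)},
    7: {1: (3, 3), 2: (3, 2)},
}

def slack(sector, vertex_type):
    """Return (6d + 1 - c_M, c_m - (6d - 1)) for one sector/type pair."""
    c_max, c_min = table2[sector][vertex_type]
    six_d = 6 * deflection[sector]
    return six_d + 1 - c_max, c_min - (six_d - 1)

all_slacks = {
    (sector, vertex_type): slack(sector, vertex_type)
    for sector, row in table2.items()
    for vertex_type in row
}

# Both slacks are non-negative everywhere, which yields c <= 6 + b and c >= 6 - b.
print(all(upper >= 0 and lower >= 0 for upper, lower in all_slacks.values()))
```

The pairs with positive upper slack are exactly the ones whose counts appear in the bracket added to $c_M$, and similarly for the lower slack and $c_m$.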
Logarithmetica Britannica. Being a Standard Table of Logarithms to Twenty Decimal 
Places. By Alexander John Thompson. Part V, Numbers 50000 to 60000 Issued by the 
Biometric Laboratory, University of London, to Commemorate the Tercentenary of 
Henry Briggs’ Publication of the Arithmetica Logarithmica, 1624. Subscription Issue. 
Cambridge, The University Press, 1931. 


This is the fifth part (the fourth not yet published) of this tremendous undertaking. 


It consists of twenty-place logarithms of numbers of five digits, accompanied by values 
of second and fourth differences. The project speaks for itself; it is sufficient to say 
that the result is all that is to be expected of any product of the Cambridge Press. 


—American Mathematical Monthly 
38, (1931) p. 407 
NOTES 


Edited by: John Duncan 


A Simple Example on 
Non-Sequentialness 
in Topological Spaces 


Heinz König 


There are well-known examples of countable Hausdorff topological spaces which 
are not discrete but show certain typical features of discreteness: all compact 
subsets are finite, and therefore all convergent sequences are ultimately constant. 
These spaces are of interest in functional analysis and measure theory. The 
examples known to the present author are due to Arens [1] and Varadarajan [6]. 
This note wants to add a different example which is strikingly simple. It arose in [5] 
in connection with the double limit relation. 


THE NEW EXAMPLE. We fix a sequence $(t_n)$ of real numbers $t_n > 0$ such that 
$t_n \to 0$ for $n \to \infty$ and $\sum_{n=1}^{\infty} t_n = \infty$ (for example $t_n = 1/n$). We call a subset $S \subset \mathbb{N}$ 
small iff $\sum_{n \in S} t_n < \infty$, which is to include $S = \emptyset$. Thus all finite subsets of $\mathbb{N}$ are 
small, but there are also infinite small subsets: in fact, each infinite subset of $\mathbb{N}$ 
contains a small infinite subset. By means of the small subsets of $\mathbb{N}$ one then forms 
a topology on $X := \mathbb{N} \cup \{\infty\}$, with the open sets defined to be i) all subsets $A \subset \mathbb{N}$, 
and ii) those subsets $A \subset X$ with $\infty \in A$ whose complements $A' \subset \mathbb{N}$ are small. It 
is obvious that this is a Hausdorff topology on $X$ which is not discrete, and it is a 
simple verification that all compact subsets of $X$ are finite. 
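The claim that each infinite subset of $\mathbb{N}$ contains a small infinite subset can be made concrete: since $t_n \to 0$, one may pick elements $n_1 < n_2 < \cdots$ of the given set with $t_{n_k} \le 2^{-k}$, so the chosen weights sum to at most 1. The Python sketch below illustrates this greedy choice for the example $t_n = 1/n$; the function names and the choice of the even numbers as the infinite set are assumptions made only for illustration.

```python
from fractions import Fraction
from itertools import count, islice

def t(n):
    """The example weights t_n = 1/n from the text."""
    return Fraction(1, n)

def small_infinite_subset(elements):
    """From an increasing stream of naturals, yield n_k with t(n_k) <= 2**-k."""
    k = 1
    for n in elements:
        if t(n) <= Fraction(1, 2 ** k):
            yield n
            k += 1

# An infinite subset of N that is not small (its weights sum to infinity).
evens = (2 * m for m in count(1))

# First eight elements of a small infinite subset of the evens.
chosen = list(islice(small_infinite_subset(evens), 8))
partial_sum = sum(t(n) for n in chosen)
print(chosen, partial_sum)
```

Every partial sum of the chosen weights stays below $\sum_k 2^{-k} = 1$, which is why the selected subset is small.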

We turn to the two previous examples. Each time the above role of the small 
subsets of $\mathbb{N}$ will be assumed by some other set system $\sigma$ on $\mathbb{N}$. 

The example of Arens (see also Kelley [4] Problem 2.E and Engelking [3] 
Example 1.6.20). We fix a sequence $(X_n)$ of pairwise disjoint infinite subsets 
$X_n \subset \mathbb{N}$ with union $\mathbb{N}$. Then we define $\sigma$ to consist of the subsets $S \subset \mathbb{N}$ such that 
$S \cap X_n$ is finite for almost all $n$. 

The example of Varadarajan (see also Berg-Christensen-Ressel [2] Exercise 
2.1.30). We define $\sigma$ to consist of the subsets $S \subset \mathbb{N}$ such that 


$$\frac{1}{n}\,\mathrm{card}(S \cap \{1, \ldots, n\}) \to 0 \quad \text{for } n \to \infty.$$


Thus each time we have a system $\sigma$ of subsets of $\mathbb{N}$, intended to form the small 
subsets of $\mathbb{N}$, with the properties 


1) $\sigma$ contains all finite subsets of $\mathbb{N}$; 
2) $S \in \sigma$ implies $T \in \sigma$ for all $T \subset S$; 
3) $\sigma$ is stable under finite unions; 
4) $\mathbb{N}$ is not a member of $\sigma$; 
5) each infinite $S \subset \mathbb{N}$ contains an infinite $T \in \sigma$. 


In each of the two previous examples properties 1)—4) are obvious and 5) requires 
a little proof. In the new example all properties are obvious. 

By means of a set system $\sigma$ on $\mathbb{N}$ with the above properties 1)-5) one then 
forms a topology $\tau$ on $X := \mathbb{N} \cup \{\infty\}$, with the open sets defined to be i) all subsets 
$A \subset \mathbb{N}$, and ii) those subsets $A \subset X$ with $\infty \in A$ whose complements $A' \subset \mathbb{N}$ are 
members of $\sigma$. We collect the main consequences in the proposition below, the 
proof of which can be left to the reader as a sequence of simple exercises. We note 
that the last assertion is independent of condition 5) above. 


Proposition. 1) $\tau$ is a Hausdorff topology on $X$ which is not discrete. 2) Each 
compact subset (and even each relatively countably compact subset) of $X$ is finite. 
Therefore each convergent sequence in $X$ is ultimately constant. 3) $\tau$ is completely 
normal: each nonvoid subset of $X$ is normal in its relative topology. 


In particular the subset $\mathbb{N} \subset X$ has the cluster point $\infty$. Also the sequence $(x_n)$ 
of the points $x_n = n$ has the cluster value $\infty$. But there is no sequence in $\mathbb{N}$ which 
converges to $\infty$. 

Thus we have a common scheme for all the above examples. It is obvious that 
the new example is particularly simple. 

There is also the notorious non-constructive example: By Zorn’s lemma, each 
set system $\sigma$ on $\mathbb{N}$ with properties 1)-4) (for example the system of all finite 
subsets) is contained in a maximal such set system (in order to work with the usual 
notions of filters and ultrafilters one has to pass to complements). We claim that 
each set system $\sigma$ on $\mathbb{N}$ which is maximal with respect to properties 1)-4) also 
satisfies 5), and hence produces a topology $\tau$ on $X := \mathbb{N} \cup \{\infty\}$ as above. To see 
this note first that for each $T \subset \mathbb{N}$ one has either $T \in \sigma$ or $T' \in \sigma$. Now fix an 
infinite $S \subset \mathbb{N}$. We write $S = P \cup Q$ with disjoint infinite $P, Q \subset S$. In case 
$P, Q \notin \sigma$ then $P', Q' \in \sigma$ and hence $\mathbb{N} = (P \cap Q)' = P' \cup Q' \in \sigma$, which is not 
true. Thus we have $P \in \sigma$ or $Q \in \sigma$. This proves 5). 


The author wants to thank the referee and the Notes editor for good advice. 
The Secant Method and the Golden 
Mean 


Melvin J. Maron and Robert J. Lopez 


The secant method is a well-known method for finding roots $\alpha$ of the equation 
$f(x) = 0$. Starting with two initial approximations of $\alpha$, say $x_{-1}$ and $x_0$, the secant 
method generates 


$$x_{k+1} = x_k - \frac{f(x_k)(x_k - x_{k-1})}{f(x_k) - f(x_{k-1})}, \qquad k = 0, 1, 2, \ldots \tag{1}$$


The rate at which the sequence $\{x_k\}$ converges to a root $\alpha$ depends on the 
multiplicity of $\alpha$. Recall that $\alpha$ is a root of multiplicity $m$ of the function $f$ if $f(x)$ 
can be written as 


$$f(x) = (x - \alpha)^m \phi(x), \quad \text{where } \phi \text{ is bounded at } \alpha \text{ and } \phi(\alpha) \ne 0. \tag{2}$$


It is well known (see [2]) that if $\alpha$ is a simple root, that is, if $m = 1$, then there will 
be a nonzero asymptotic error constant $C$ such that the errors 


$$\varepsilon_k = \alpha - x_k$$


satisfy 


$$\lim_{k \to \infty} \frac{|\varepsilon_{k+1}|}{|\varepsilon_k|^p} = C, \quad \text{where } p = \frac{\sqrt{5} + 1}{2} = 1.618\cdots.$$


Thus, secant method iterates will converge superlinearly with order $p = \frac{1}{2}(\sqrt{5} + 1)$ 
to simple roots $\alpha$. Ancient Greek mathematicians attached profound significance 
to the numbers 


$$r = \frac{\sqrt{5} - 1}{2} = 0.618\cdots \quad \text{and} \quad p = r + 1 = \frac{\sqrt{5} + 1}{2} = 1.618\cdots. \tag{3}$$


They referred to $r$ as the golden mean because the ratio $(1 - r) : r$ equals $r$. 
Observe that $r$ and $-p$ are the roots of the quadratic equation 


$$x^2 + x - 1 = 0. \tag{4}$$


The purpose of this paper is to prove the following result¹, which shows that the 
golden mean is also related to the way secant method approximants converge to 
double roots. 

¹Part (a) was obtained for $f(x) = x^2(x - 1)^2$ and $\alpha = 1$ in [1] (Example E-11, p. 278). 


Theorem. Suppose $\alpha$ is a root of $f$ for which 


$$f(x) = (x - \alpha)^2 \phi(x), \quad \text{where } \lim_{x \to \alpha} \phi(x) \ne 0, \tag{5}$$


and let $\varepsilon_k = \alpha - x_k$, where $x_k$ is the $k$th approximant generated by secant method 
(1). 


(a) If the sequence $\{x_k\}$ converges to $\alpha$ and $\lim_{k \to \infty} (\varepsilon_k / \varepsilon_{k-1})$ exists, then 


$$\lim_{k \to \infty} \frac{\varepsilon_k}{\varepsilon_{k-1}} = r, \quad \text{where } r = \frac{\sqrt{5} - 1}{2} = 0.618\cdots.$$

(b) For $x_{k-1}, x_k \approx \alpha$ and $\rho_k = \varepsilon_k / \varepsilon_{k-1} \approx r$, the ratios $\rho_k$ will satisfy 


(6) 


It follows that $\{\rho_k\}$ will converge to $r$ once $x_{k-1}$ and $x_k$ are sufficiently close to $\alpha$ 
and $\rho_k$ is sufficiently close to $r$. 


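Proof: (a) Since $x_k - x_{k-1} = (x_k - \alpha) + (\alpha - x_{k-1}) = \varepsilon_{k-1} - \varepsilon_k$, we have from 
(1) 


$$\varepsilon_{k+1} = \alpha - x_{k+1} = \varepsilon_k + \frac{f(x_k)(\varepsilon_{k-1} - \varepsilon_k)}{f(x_k) - f(x_{k-1})}. \tag{7}$$


In view of (5), we may assume $x_{k-1}$ and $x_k$ to be sufficiently close to $\alpha$ so that $\varepsilon_k$, 
$\phi(x_k)$, and $f(x_k) = \varepsilon_k^2 \phi(x_k)$ are all nonzero. Under this assumption, we can 
rearrange (7) to get 


$$\frac{\varepsilon_{k+1}/\varepsilon_k - 1}{1 - \varepsilon_{k-1}/\varepsilon_k} = \frac{1}{f(x_{k-1})/f(x_k) - 1}, \tag{8}$$


where $f(x_{k-1})/f(x_k) = \phi(x_{k-1})/\phi(x_k) \cdot (\varepsilon_{k-1}/\varepsilon_k)^2$. Upon introducing the ratios 


$$\rho_k = \frac{\varepsilon_k}{\varepsilon_{k-1}} \quad \text{and} \quad \beta_k = \frac{\phi(x_{k-1})}{\phi(x_k)} \quad \text{for } k = 1, 2, \ldots$$


in (8) and using the assumption $\lim_{x \to \alpha} \phi(x) \ne 0$, we get 


$$\frac{\rho_{k+1} - 1}{\rho_k - 1} = \frac{\rho_k}{\beta_k - \rho_k^2}, \quad \text{where } \lim_{k \to \infty} \beta_k = 1. \tag{9}$$


So if $\lim_{k \to \infty} \rho_k$ exists, it must be a solution of the equation $(x - 1)(x^2 + x - 1) 
= 0$, that is 1, $r$, or $-p$. Since $\lim_{k \to \infty} \rho_k$ cannot be zero, $\{x_k\}$ cannot converge 
superlinearly to $\alpha$; however, linear convergence requires $|\lim_{k \to \infty} \rho_k| \le 1$. But 
$\lim_{k \to \infty} \rho_k$ cannot be 1 because if it were, then 


$$|\rho_{k+1} - 1| < |\rho_k - 1|, \quad \text{that is,} \quad \left| \frac{\rho_{k+1} - 1}{\rho_k - 1} \right| < 1$$


would hold for infinitely many $k$’s in (9), whereas $|\rho_k / (\beta_k - \rho_k^2)| > 1$ would hold 
for sufficiently large $k$. This leaves $r$ as the only possible asymptotic error 
constant. 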

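(b) To obtain (6), we first rewrite (9) as the finite difference equation 


$$\rho_{k+1} = \frac{\rho_k - \beta_k}{\rho_k^2 - \beta_k}, \qquad k = 0, 1, 2, \ldots, \tag{10}$$


and then subtract $r$ to get 


$$\rho_{k+1} - r = \frac{\rho_k - \beta_k}{\rho_k^2 - \beta_k} - r 
= \frac{\rho_k - r\rho_k^2 - \beta_k(1 - r)}{\rho_k^2 - \beta_k} 
= \frac{\rho_k^2 r - \rho_k + \beta_k r^2}{\beta_k - \rho_k^2} \qquad [1 - r = r^2].$$


Writing $\rho_k$ as $(\rho_k - r) + r$ and collecting numerator terms gives 


$$\rho_{k+1} - r = \frac{(\rho_k - r)^2 r + (\rho_k - r)(2r^2 - 1) + (r^3 - r + \beta_k r^2)}{\beta_k - \rho_k^2}$$

$$= \frac{r\{(\rho_k - r)(\rho_k - 1) + r(\beta_k - 1)\}}{\beta_k - \rho_k^2} \qquad [r^2 - 1 = -r]$$

$$= \frac{r(\rho_k - 1)}{\beta_k - \rho_k^2}(\rho_k - r) + \frac{r^2}{\beta_k - \rho_k^2}(\beta_k - 1). \tag{11}$$


Observe that if $\beta \to 1$ and $\rho \to r$, then 


$$\frac{r(\rho - 1)}{\beta - \rho^2} \to \frac{r(r - 1)}{1 - r^2} = \frac{-r}{1 + r} = -r^2$$


[see (3)] and, similarly, $r^2/(\beta - \rho^2) \to r^2/(1 - r^2) = r$. It thus follows from (11) 
that 


$$\rho_{k+1} - r = (-r^2 + \mu_1)(\rho_k - r) + (r + \mu_2)(\beta_k - 1)$$


where $|\mu_1|$ and $|\mu_2|$ can be made arbitrarily small by keeping $|x_k - \alpha|$ and $|\rho_k - r|$ 
sufficiently small. This implies (6) and completes the proof of the theorem. □ 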


Traub [1, p. 278] states that secant method iterates will converge linearly to 
roots of any multiplicity m > 1. However, the authors are aware of no proof of this 
plausible assertion in [1] or elsewhere. 
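The behaviour described in part (a) is easy to observe numerically. The Python sketch below runs secant iteration (1) on $f(x) = x^2(x - 1)^2$, the example mentioned in the footnote, whose double root is $\alpha = 1$, and records the error ratios $\rho_k = \varepsilon_k / \varepsilon_{k-1}$. The function name, the starting values 1.5 and 1.4, and the number of steps are arbitrary choices made for illustration; the run is an experiment in floating-point arithmetic, not a proof.

```python
import math

def secant_error_ratios(f, x_prev, x_curr, alpha, steps):
    """Run secant iteration (1) and return the ratios eps_{k+1} / eps_k."""
    ratios = []
    for _ in range(steps):
        f_prev, f_curr = f(x_prev), f(x_curr)
        denom = f_curr - f_prev
        if denom == 0 or x_curr == alpha:
            break  # iteration (1) is undefined, or the root was hit exactly
        x_next = x_curr - f_curr * (x_curr - x_prev) / denom
        ratios.append((alpha - x_next) / (alpha - x_curr))
        x_prev, x_curr = x_curr, x_next
    return ratios

def f(x):
    """Example with a double root at alpha = 1."""
    return x ** 2 * (x - 1) ** 2

golden_mean = (math.sqrt(5) - 1) / 2      # r in formula (3)
ratios = secant_error_ratios(f, 1.5, 1.4, 1.0, 30)

for k, rho in enumerate(ratios[:5], start=1):
    print(k, rho)
print("last ratio:", ratios[-1], "golden mean r:", golden_mean)
```

The ratios settle near $r$, so the errors shrink by a factor of roughly 0.618 per step: linear rather than superlinear convergence, in contrast with the simple-root case.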
$R_n$ Contains a Division Ring iff $R$ Does 


Ayman Badawi 


INTRODUCTION. Let $R$ be a ring with 1, and let $R_n$ denote the complete matrix 
ring of all $n \times n$ matrices over $R$ under the usual matrix addition and multiplication. 
Recall $A, B \in R_n$ are similar iff there exists an invertible $P \in R_n$ such that $A = PBP^{-1}$. 
If $A \in R_n$ is similar over $R$ to a diagonal matrix, then $A$ is called [1] diagonable 
over $R$. For $B \in R_n$, $b_{ij}$ denotes the entry of $B$ in the $i$th row and $j$th column. 

In this note, we give an alternative proof of [1, Theorem 1] which is considerably 
shorter than that in [1]. We would like to point out that our proof begins exactly 
like the original. 


Theorem ([1, Theorem 1]). Let $R$ be a ring with 1 for which each idempotent matrix 
in $R_n$ is diagonable over $R$. Then $R$ contains a division ring if and only if $R_n$ contains 
a division ring. 


Proof: If $R$ contains a division ring, then clearly $R_n$ contains a division ring. 
Assume $R_n$ contains a division ring $K$. The division ring $K$ has an identity (call it 
$J$) and by the hypothesis $PJP^{-1} = I$ is a diagonal matrix for some invertible matrix 
$P \in R_n$. Since the conjugation of $R_n$ by $P$ induces a ring automorphism of $R_n$, 
$M = PKP^{-1}$ is a division ring of $R_n$ and has $I$ as the identity. Hence $I$ is a nonzero 
idempotent of $R_n$. Let $S = \{A \in M : A$ is diagonal$\}$. Since $I \in S$, $S$ is not empty. 
We leave it to the reader to verify that $S$ is a division subring of $M$. Since $I \ne 0$, 
there exists $1 \le j \le n$ such that $i_{jj}$ is a nonzero idempotent of $R$. Let $D = \{a_{jj} : 
A \in S\}$. Then $D$ is a division ring of $R$ with $i_{jj}$ as the identity. 

We end this note with some examples that satisfy the hypothesis of the Theorem 
and with one example where the hypothesis fails. Let $R$ be a commutative ring 
with 1. Then $R$ is called ID (basal) as in [7] ([2]) iff for every $n \ge 1$ the 
idempotents of $R_n$ are diagonable. Foster [2] has shown that if $R$ is a principal 
ideal domain, then $R$ is ID. Seshadri [6] has shown that if $R$ is a principal ideal 
domain, then $R[x]$ is ID. In particular if $F$ is a field, then $F[x, y]$ is ID. Steger [7] 
has shown that if $R$ is an elementary division ring (i.e., for every $n \ge 1$ and 
$A \in R_n$ there exist invertible matrices $P, Q$ in $R_n$ such that $PAQ$ is diagonal) then 
$R$ is ID. Also, Steger has shown that if $R$ is a $\pi$-regular ring (i.e., for every $x$ in $R$ 
there exist $n \ge 1$ and $y$ in $R$ ($n$ and $y$ depending on $x$) such that $x^n y x^n = x^n$) 
then $R$ is ID. In particular for every $m > 1$, $\mathbb{Z}_m$ (i.e., $\mathbb{Z}/m\mathbb{Z}$) is ID (Foster has 
shown independently that $\mathbb{Z}_m$ is ID). 

Finally, Theorem 3 in [7] states that if $R$ is ID, then every invertible ideal of $R$ 
is principal. Thus if $R$ is a Dedekind domain which is not principal, then $R$ is not 
ID. In particular, let $R = \mathbb{Z}[\sqrt{-5}]$ ($\mathbb{Z}$ is the set of all integers). Then $R$ is a 
Dedekind domain, see [4, Ex. 37, p. 70]. But $R$ is not a unique factorization 
domain, for example 21 does not have unique factorization in $R$. Thus $R$ is not 
principal and therefore it is not ID. 
A Further Simplification of Dixon’s Proof 
of Cauchy’s Integral Theorem 


Peter A. Loeb 


The modification in [1] of Dixon’s proof of the Cauchy Integral Theorem and 
Formula is based on the proposition stated below. In this note we give a proof of 
that proposition which is more suitable for undergraduate students. In what 
follows, $G$ will be an open set in the complex plane $\mathbb{C}$, and $\gamma$ will be a closed 
rectifiable curve. We write $f \in H(G)$ if $f$ is holomorphic, i.e. analytic, in $G$, and 
we use the notation $D(z, r)$ for the disk $\{w \in \mathbb{C} : |w - z| < r\}$. The trace of $\gamma$ in $\mathbb{C}$ 
is denoted by $\{\gamma\}$; we say the curve $\gamma$ is in $G$ when $\{\gamma\} \subset G$. 


Proposition. If $\gamma$ is a curve in $G$, then for any $z \in \{\gamma\}$ there is a closed curve $\sigma$ in $G$ 
with $z \notin \{\sigma\}$ such that $\int_\sigma f = \int_\gamma f$ for all $f \in H(G)$. 


Proof: We assume that there is a point $\zeta \ne z$ with $\zeta \in \{\gamma\}$; otherwise the result is 
trivial. Pick $r > 0$ so that $D(z, r) \subset G$ and $\zeta \notin D(z, r)$. We will assume that $\gamma$ is 
given by $\gamma(t)$ for $t \in [0, 1]$ and $\gamma(0) = \gamma(1) = \zeta$. By the uniform continuity of the 
mapping $\gamma$, there is a natural number $n$ such that if $s, t \in [0, 1]$ and $|t - s| \le 
1/n$, then $|\gamma(t) - \gamma(s)| < r$. Partition the interval $[0, 1]$ using the points $0 < 1/n 
< \cdots < (n - 1)/n < 1$. Let $0 = x_0 < x_1 < x_2 < \cdots < x_m = 1$ be the set of 
partition points $k/n$ such that $\gamma(k/n) \ne z$. If between adjacent points $x_i$ and $x_{i+1}$ 
there is a point of the form $k/n$ or any other point $t_0$ with $\gamma(t_0) = z$, then the 
path $\gamma(t)$, $x_i \le t \le x_{i+1}$, is in the disk $D(z, r)$. In this case, we may replace the 
map $\gamma$ on the interval $[x_i, x_{i+1}]$ with a path that goes from $\gamma(x_i)$ to $\gamma(x_{i+1})$ in the 
set $D(z, r) - \{z\}$. By Cauchy’s integral theorem, applied to the disk $D(z, r)$, this 
replacement does not change the value of the integral for any $f \in H(G)$. With 
these replacements, the new path $\sigma$ avoids $z$. □ 


REFERENCE 


1. P. A. Loeb, A note on Dixon’s proof of Cauchy’s Integral Theorem, American Mathematical 
Monthly 98 (1991) 242-244. 


Department of Mathematics 
University of Illinois 

1409 West Green St. 
Urbana, Ill. 61801 
LOEB@MATH.UIUC.EDU 


Who Was the Author? 
On an extension of Sir John Wilson’s theorem to all numbers whatever, 
Philosophical Magazine, 1839. 
Note on elimination, Philosophical Magazine, 1840. 


Proof of the hitherto undemonstrated fundamental theorem of invariants, 
Philosophical Magazine, 1878. 


On a point in the theory of vulgar fraction, Amer. Jour. of Math, 1880. 
On Farey Series, Johns Hopkins University Circulars, 1883. 


Answer on page 697. 




COMPUTER SCIENCE SAMPLER 


Edited by: Catherine C. McGeoch 


Zero-Knowledge Proofs 


Catherine C. McGeoch 


On a moonless night the spy returns to the castle after a reconnoitering mission to 
the enemy camp. As he nears the gate a voice whispers, “What’s the password?” 
But is it friend or foe who whispers? How can the spy show that he knows the 
password without actually revealing it to a possible imposter? 

The spy’s dilemma is commonplace now with the widespread use of telecommu- 
nications. When your automatic teller machine communicates with your bank, each 
must be assured that the other is legitimate; the electronic “passwords” must be 
unforgeable and must be of no use to imposters and eavesdroppers. One method 
that has been proposed for exchanging passwords in this context is the zero-knowl- 
edge proof. 

Renaissance mathematicians developed their own primitive zero-knowledge 
proof systems. When both Tartaglia and Fior claimed knowledge of an algebraic 
solution to cubic equations, a contest was arranged in which each proposed thirty 
problems for the other to solve. In the end, Tartaglia had solved all thirty, thus 
providing a convincing demonstration that he knew the method without actually 
revealing it. Fior solved none. (It turned out that each had worked out solutions to 
certain classes of cubics, but neither had solved the general problem [2]). 

We’ve progressed considerably in formalizing this idea. An interactive protocol 
comprises two algorithms P (the prover) and V (the verifier) that read a common 
input string w of length |w| and then compute and communicate in alternating 
turns to determine whether w has some specified property. The verifier is 
polynomially-bounded: it must eventually halt and its total computation time must 
be bounded by a fixed polynomial in |w|. When it halts, the verifier outputs either 
accept or reject depending upon whether the property holds for w. The verifier is 
probabilistic, that is, allowed to make random choices during the computation 
according to the results of coin tosses. The prover is allowed to have unlimited 
computational power. 

A language $\mathcal{L}$ is a set of strings. An interactive proof system for $\mathcal{L}$ is an
interactive protocol in which P helps V to decide whether $w \in \mathcal{L}$. We require
that with high probability the verifier be correct when accepting or rejecting the
membership of w in $\mathcal{L}$. More precisely, for every constant c > 0, for sufficiently
large $w \in \mathcal{L}$ the probability (over all coin tosses) that V halts and accepts must be
at least $1 - |w|^{-c}$. If $w \notin \mathcal{L}$ then we require that no prover P* be able to
convince V otherwise: that is, for every c > 0 and large enough w, and for any
interactive protocol (P*, V), V rejects with probability at least $1 - |w|^{-c}$.

In a zero-knowledge interactive proof system, whenever $w \in \mathcal{L}$, P reveals no
additional knowledge beyond the fact of membership. Informally, “no additional
knowledge” means that the computational power of any verifier V* after
participating in the protocol is no more than what V* would have gained by simply
assuming $w \in \mathcal{L}$.

To solve the spy’s dilemma we use an interactive zero-knowledge proof, choosing
$\mathcal{L}$, w and a “secret” concerning w. An authentic P will transmit parts of the
secret so that V accepts $w \in \mathcal{L}$, but the transmission is otherwise useless to
eavesdroppers and bogus verifiers. An imposter P* would, with high probability,
cause V to reject w.

Let $\mathcal{L}_3$ be “the set of strings that represent 3-colorable graphs” under some
fixed graph-representation scheme. A graph is 3-colorable if its vertices can be
assigned colors such that no adjacent vertices have the same color and no more
than 3 colors are used. Let the set of colors be denoted $\mathcal{C}$. Both P and V have
access to an encryption function $f: (\mathcal{C} \times \mathcal{B}) \to \mathcal{C}'$, where $\mathcal{B}$ contains long strings
of h and t values (representing long sequences of coin tosses) and $\mathcal{C}'$ is a set of
encrypted colors.

The common input string $w_G$ represents a particular 3-colorable graph G of n
vertices and m edges. The secret, known only to P, is a correct 3-coloring of G; let
$c_i$ denote the color of vertex i under this coloring. An interactive zero-knowledge
proof that $w_G \in \mathcal{L}_3$ is sketched below.


1. P applies a random permutation $\pi$ to the colors: now each vertex i has
color $\pi(c_i)$. Next, for each i = 1...n, the prover forms a random string $r_i$
from several coin tosses and computes $c'_i = f(\pi(c_i), r_i)$. The encrypted
vertex colors $c'_i$ are sent to V.

2. V saves $c'_1 \ldots c'_n$ and then chooses two adjacent vertices x and y at random
and sends them to P.

3. P checks that (x, y) is really an edge in G. If not, the prover stops, having
detected an imposter V that doesn’t know the protocol. If (x, y) is an edge,
the prover sends the colors $\pi(c_x)$ and $\pi(c_y)$ and the values $r_x$ and $r_y$ to V.

4. V computes $c''_x = f(\pi(c_x), r_x)$ and $c''_y = f(\pi(c_y), r_y)$ and looks for
inconsistencies with the transmission in Step 1, checking that $c''_x = c'_x$ and $c''_y = c'_y$.
The verifier also looks for violations of 3-colorability, checking that
$\pi(c_x), \pi(c_y) \in \mathcal{C}$ and that $\pi(c_x) \ne \pi(c_y)$. If any one of these checks fails,
then V stops and rejects.

5. If the checks all pass, then P and V begin again in Step 1. If $m^2$ iterations
of this protocol are completed without rejection, then V halts and accepts
$w_G$.
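
The five steps above can be simulated in a few lines. The following is only a sketch under stated assumptions: the encryption function f is replaced by a SHA-256 hash of a random nonce and the color (a hypothetical stand-in, not the function discussed in the article), colors are the integers 0, 1, 2, and the names `commit` and `run_protocol` are illustrative. Step 3 is omitted because the simulated verifier always chooses a genuine edge.

```python
import hashlib
import random
import secrets

COLORS = (0, 1, 2)  # assumption: colors are encoded as 0, 1, 2


def commit(color, nonce):
    """Hypothetical stand-in for f(color, r): hash the nonce and the color."""
    return hashlib.sha256(nonce + bytes([color])).hexdigest()


def run_protocol(edges, coloring, rounds=None, rng=None):
    """Simulate prover and verifier; return True if the verifier accepts."""
    rng = rng or random.Random()
    m = len(edges)
    rounds = m * m if rounds is None else rounds
    for _ in range(rounds):
        # Step 1: the prover permutes the colors and commits to every vertex.
        perm = list(COLORS)
        rng.shuffle(perm)
        permuted = {v: perm[c] for v, c in coloring.items()}
        nonces = {v: secrets.token_bytes(16) for v in coloring}
        sent = {v: commit(permuted[v], nonces[v]) for v in coloring}
        # Step 2: the verifier picks a random edge.
        x, y = rng.choice(edges)
        # Step 3 (prover side): open the two endpoints.
        cx, cy, rx, ry = permuted[x], permuted[y], nonces[x], nonces[y]
        # Step 4: the verifier recomputes the commitments and checks the colors.
        if commit(cx, rx) != sent[x] or commit(cy, ry) != sent[y]:
            return False
        if cx not in COLORS or cy not in COLORS or cx == cy:
            return False
    # Step 5: every round passed.
    return True


triangle = [(0, 1), (1, 2), (0, 2)]
print(run_protocol(triangle, {0: 0, 1: 1, 2: 2}))  # a proper coloring
print(run_protocol(triangle, {0: 0, 1: 0, 2: 0}))  # an improper coloring
```

An honest prover with a proper coloring passes every round, while the monochromatic coloring fails on whichever edge the verifier probes first.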


Certainly the above protocol represents an interactive proof system for $\mathcal{L}_3$. If
$w_G \in \mathcal{L}_3$ then V accepts with probability 1 after $m^2$ iterations. If $w_G \notin \mathcal{L}_3$ then
the prover must send an invalid coloring (one with adjacent vertices the same color
or one that uses more than 3 colors) in Step 1, which will be detected in Step 4
with probability at least 1/m at each iteration. The probability that V halts and
rejects after $m^2$ random probes is at least $1 - (1 - 1/m)^{m^2}$. A little calculation
shows that this probability is sufficient (with the reasonable assumptions that
$|w| < 2m \log_2 n$ and there are no isolated vertices in G).
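
One way to carry out the “little calculation” is sketched here; it is a reconstruction under the two stated assumptions, not necessarily the computation the author had in mind. It uses $1 - t \le e^{-t}$ and the fact that a graph with no isolated vertices has $n \le 2m$.

```latex
\[
\Bigl(1 - \frac{1}{m}\Bigr)^{m^2} \le \bigl(e^{-1/m}\bigr)^{m^2} = e^{-m},
\qquad
|w|^{c} < \bigl(2m \log_2 n\bigr)^{c} \le \bigl(2m \log_2 (2m)\bigr)^{c} \le e^{m}
\quad \text{for all sufficiently large } m .
\]
\[
\text{Hence}\quad
1 - \Bigl(1 - \frac{1}{m}\Bigr)^{m^2} \;\ge\; 1 - e^{-m} \;\ge\; 1 - |w|^{-c}.
\]
```

The last step holds for every fixed c > 0 because a polynomial in m is eventually dominated by $e^{m}$.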

Indeed, for an interactive proof it would be sufficient for P simply to send the
vertex colors to V; the extra steps are needed to ensure zero-knowledge. In Step 4,
V “learns” that vertices x and y have different colors, which is no more than it
would learn from simply assuming $w_G \in \mathcal{L}_3$. The verifier gains no additional
knowledge over time because the colors are randomly permuted and encrypted at
each iteration. Even after several probes V has no idea how to 3-color the graph.
A formal proof of zero-knowledge is rather too long to go into here: see Goldreich
et al. [4], [5] for details.


BUT WILL IT WORK? Some nagging details must be addressed if this protocol is
to be of any use to our spy. First, the proof of zero-knowledge depends upon an
assumption that $f(\cdot,\cdot)$ is a secure encryption scheme, in the sense that it is not
feasible to decrypt by deriving each $\pi(c_i)$ from $c'_i$. Such a function is not known to
exist.

Second, we must be assured that only the legitimate prover P could know a 
correct 3-coloring of G. The problem of finding 3-colorings of arbitrary graphs is 
known to be NP-Hard: as a consequence it is widely believed (but not proven) that 
any algorithm for finding 3-colorings must use exponential time on some graphs. 
Suppose we could build G such that the coloring is known by construction, but 
finding it independently requires time exponential in the size of G. Then we could 
secretly tell the coloring to P and (by choosing G large enough) be assured that 
any impostor computer would need, say, 10,000 years to find a 3-coloring. We also
would have settled the most important open question in complexity theory today by
proving that a famous set of problems known as $\mathcal{NP}$ is not equivalent to another
set called $\mathcal{P}$.

So our zero-knowledge scheme is not provably secure. Even without this 
assurance, however, the method is efficient and reliable enough to have been 
applied in practice. There are several encryption functions that have not been 
broken. A well-known one, for example, encrypts by multiplying large primes and 
relies on the fact that there is no known way to factor large numbers efficiently. 
And we can take other steps to reduce the chance of compromising the protocol: 
there exist zero-knowledge proofs that do not require encryption functions at all, 
and there exist problems that are “harder” than 3-coloring to solve. 


FURTHER READING. The notions of interactive proof systems and knowledge 
complexity of proofs were first developed by Goldwasser et al. [6], [7]. Blum et al. 
[1] have since shown that zero-knowledge does not require interaction: that is, any 
interactive zero-knowledge proof can be replaced by one in which P sends 
messages to V but never receives any. 

Goldreich et al. [4], [5] give several examples of zero-knowledge proofs, including
the 3-colorability problem shown here, and discuss issues relating to secure
protocols. Landau [8] describes what can happen when mathematicians and theo- 
retical computer scientists get involved with problems of interest to the Depart- 
ment of Defense. 

Technically, the zero-knowledge proof we’ve seen does reveal one bit of
knowledge, namely that $w \in \mathcal{L}$. Feige, Fiat and Shamir [3] have exhibited proofs
that are truly zero-knowledge: the prover proves that he knows whether or not
$w \in \mathcal{L}$, but doesn’t even reveal that fact. Perhaps someday we can extend
zero-knowledge protocols to achieve a complete standstill of mathematical progress
such as that attempted during the Renaissance. For example, maybe I could
demonstrate knowledge of the status of Fermat’s last theorem¹ without revealing
the proof or even the truth or falsehood of the statement. Heaven forbid.

PROBLEMS FOR SOLUTION


E. 36 Proposed by B. H. Brown, Dartmouth College.

Show that the thirteenth of the month is more likely to be Friday than any
one of the other days of the week.

American Mathematical Monthly 40 (1933), p. 295
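
The claim in E. 36 can be checked by direct enumeration, since the Gregorian calendar repeats exactly every 400 years. The sketch below is an illustration rather than a proof in the spirit of the problem; the function name and the choice of the cycle 2001 to 2400 are arbitrary.

```python
from collections import Counter
from datetime import date

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def thirteenth_weekday_counts(start_year=2001, years=400):
    """Count the weekday of the 13th over one full 400-year Gregorian cycle."""
    counts = Counter()
    for year in range(start_year, start_year + years):
        for month in range(1, 13):
            counts[DAY_NAMES[date(year, month, 13).weekday()]] += 1
    return counts


counts = thirteenth_weekday_counts()
for day in DAY_NAMES:
    print(day, counts[day])
```

There are 4800 thirteenths in the cycle, which is not divisible by 7, so the seven weekdays cannot all occur equally often.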


¹ Please substitute “Goldbach’s Conjecture” for “Fermat’s Last Theorem” here.

PROBLEMS


10322. Proposed by Jiang Huanxin, student, FuDan University, ShangHai, China 


Let ABCD and AEFG be squares with the common vertex A and different
edge lengths. Let $\theta = \angle EAD$ ($0 < \theta < \pi/2$). Suppose that EF and CD intersect
at the point P. For which value of $\theta$ will AP be perpendicular to CF?


10323. Proposed by David E. Penney and Carl Pomerance, University of Georgia, 
Athens, GA. 


For a natural number n, let t(n) be the sum of the divisors d of n in the range
$1 \le d < n$ with n/d being squarefree. Is there an integer n for which the
sequence n, t(n), t(t(n)), ... is unbounded?
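
For readers who want to experiment with Problem 10323, the function t can be computed by brute force. This is only an exploratory sketch; the helper names are illustrative, and iteration stops at 0 because t is defined only for natural numbers.

```python
def is_squarefree(n):
    """Return True if no square larger than 1 divides n."""
    k = 2
    while k * k <= n:
        if n % (k * k) == 0:
            return False
        k += 1
    return True


def t(n):
    """Sum of the divisors d of n with 1 <= d < n and n/d squarefree."""
    return sum(d for d in range(1, n) if n % d == 0 and is_squarefree(n // d))


def trajectory(n, steps):
    """Return n, t(n), t(t(n)), ... for at most `steps` iterations."""
    seq = [n]
    for _ in range(steps):
        if seq[-1] == 0:
            break
        seq.append(t(seq[-1]))
    return seq


print(trajectory(4, 10))
print(trajectory(12, 3))
```

Small starting values either die out or reach a fixed point quickly; the problem asks whether this must always happen.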


10324. Proposed by William P. Wardlaw, United States Naval Academy, Annapolis, 
MD. 


Let a and m be positive integers and define the sequence $(x_n)_{n \ge 0}$ by $x_0 = 1$ and
$x_{n+1} = a^{x_n}$. Show that there is a positive integer N such that $x_h \equiv x_k \pmod{m}$
whenever $N \le h \le k$.


10325. Proposed by Broderick Oluyede, Georgia State University, Atlanta, GA.


For $i = 1, 2, \ldots, r$ and $j = 1, 2, \ldots, c$, let $p_{i,j} \ge 0$, and assume that
$\sum_{i=1}^{r} \sum_{j=1}^{c} p_{i,j} = 1$. Define $p_{i,\cdot} = \sum_{j=1}^{c} p_{i,j}$ and $p_{\cdot,j} = \sum_{i=1}^{r} p_{i,j}$. In addition, suppose
that


; ; 
Pi+,j+1 , yr Prog 2 yy Ph, j+i > Pi+1,k 
h=1k=1 = k=1 


for 0 < i < r and 0 < j < r. Prove that


y Py ee Em, Dow 


h=1k=1 


for 0 < i < r and 0 < j < r.


10326. Proposed by Ira Gessel, Brandeis University, Waltham, MA. 


For r a positive integer, let $K_r$ be the smallest positive integer such that


tt 
n+r\n 


is an integer for all n > 0. Show that 


10327. Proposed by Jerome Minkus, Berkeley, CA. 
Find the simple continued fraction for (e + 3)/4. 


10328. Proposed by A. Keith Austin, The University of Sheffield, Sheffield, England. 


Let A and B be sets such that $A \cap B = \emptyset$ and $A \cup B$ is the unit square
$[0, 1] \times [0, 1]$. Prove or disprove the following:

(a)* Either there is a continuous function $f: [0, 1] \to A$ with $f(0) = (0, y_0)$ for
some $y_0$ and $f(1) = (1, y_1)$ for some $y_1$, or there is a continuous function
$g: [0, 1] \to B$ with $g(0) = (x_0, 0)$ for some $x_0$ and $g(1) = (x_1, 1)$ for some $x_1$.

(b) f and g as in part (a) cannot both exist.


10329. Proposed by Gérard Letac, Université Paul Sabatier, Toulouse, France. 


Let f(x) be a positive continuous function defined for 0 < x < 1 such that, for
all u with 0 < u < 1, one has f'f(x)f(u/x) dx = u'/’. Prove that 


2x 
f(x) = mix)’ 


NOTES


Notes: (10323) A related sequence, called the aliquot sequence of n, is generated by
using a function s(n) which is the sum of all divisors d of n in the interval
$1 \le d < n$. Some examples of aliquot sequences are: 9, 4, 3, 1, 0, 0, ...; 6, 6, ...;
and 220, 284, 220, .... It is unknown whether all aliquot sequences are eventually
periodic; the case of n = 276 is unresolved at this time. (10327) The standard
reference for continued fractions is O. Perron, Die Lehre von den Kettenbrüchen.
The fourth chapter describes the continued fraction for e and related “Hurwitz
continued fractions”.
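
The aliquot examples quoted in the note are easy to reproduce. The sketch below uses a naive divisor sum, and the function names are illustrative.

```python
def s(n):
    """Sum of the divisors d of n with 1 <= d < n."""
    return sum(d for d in range(1, n) if n % d == 0)


def aliquot(n, length):
    """First `length` terms of the aliquot sequence starting at n."""
    seq = [n]
    while len(seq) < length:
        # s(0) is taken to be 0, matching the convention 9, 4, 3, 1, 0, 0, ...
        seq.append(s(seq[-1]) if seq[-1] > 0 else 0)
    return seq


print(aliquot(9, 6))
print(aliquot(6, 3))
print(aliquot(220, 3))
```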


SOLUTIONS 


Uniqueness from Asymptotic Behavior 


E 3449 [1991, 553]. Proposed by Mark A. Pinsky, Northwestern University, Evanston, 
IL. 


Suppose s is a continuous real-valued function on $[0, +\infty)$ such that s is
differentiable on $(0, +\infty)$, $0 \le s(t) \le t^2$, and

$$\frac{ds}{dt} = t + \sqrt{t^2 - s} \qquad (t > 0).$$

Prove that s is unique and obtain a closed formula for s.


Solution I by J. B. Thoo, student, University of California, Davis, CA. By direct
substitution it is easily verified that $s(t) = \tfrac{3}{4}t^2$ satisfies the requirements.

To establish that this s(t) is the unique solution, we will show that any two
solutions $s_1(t)$ and $s_2(t)$ must be identical. Let us define $g(t) = (s_1(t) - s_2(t))^2$,
which is clearly non-negative. Then for all t > 0,

$$g'(t) = 2\bigl(s_1(t) - s_2(t)\bigr)\bigl(s_1'(t) - s_2'(t)\bigr)
= -\frac{2\bigl(s_1(t) - s_2(t)\bigr)^2}{\bigl(t^2 - s_1(t)\bigr)^{1/2} + \bigl(t^2 - s_2(t)\bigr)^{1/2}} \le 0.$$

Hence, $g(t) \le g(0)$ for all t > 0. But since $0 \le s(t) \le t^2$ implies $s_1(0) = 0$ and
$s_2(0) = 0$, then also g(0) = 0; hence, for all t > 0, $g(t) \le 0$. Since g is both a
non-negative function and, it now appears, a non-positive one as well, it must be
identically zero, and so therefore $s_1 = s_2$, as claimed.
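
The “direct substitution” in Solution I can also be confirmed numerically. The sketch below is only a sanity check at a few sample points, not part of the proof; the function names are illustrative.

```python
import math


def s(t):
    """Candidate solution s(t) = 3 t^2 / 4."""
    return 0.75 * t * t


def s_prime(t):
    """Exact derivative of the candidate solution."""
    return 1.5 * t


def residual(t):
    """Left side minus right side of ds/dt = t + sqrt(t^2 - s)."""
    return s_prime(t) - (t + math.sqrt(t * t - s(t)))


for t in (0.1, 1.0, 2.5, 10.0):
    print(t, residual(t), 0 <= s(t) <= t * t)
```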


Solution II by Frédéric Brulois, California State University-Dominguez Hills,
Carson, CA. Rewrite the given condition in the form $s'(t) - t = (t^2 - s(t))^{1/2}$.
Square it and differentiate it to obtain $2(s'(t) - t)s''(t) = s'(t)$. This is a first-order
homogeneous equation in $s'(t)$, which can be solved by standard techniques to
obtain $s'(t)^2(3t - 2s'(t)) = C$. Thus, using the parameter $p = s'(t)$, we get
$t = (2/3)p + (C/3)p^{-2}$ and $s = (1/3)p^2 + (2C/3)p^{-1}$. Since $s'(t)$ lies between t and
2t for all t > 0, the only possible value of C that would permit this equation to
hold for arbitrarily small t is 0. It follows from the parametric solution that
$t = 2p/3$ and $s = p^2/3$. Thus $s(t) = 3t^2/4$.


Solution III by Kiran S. Kedlaya, student, Harvard University, Cambridge, MA.
Define $f(x) = \tfrac{1}{2}\bigl(1 + (1 - x)^{1/2}\bigr)$, and note that x = 3/4 is a fixed point of this
function. Also note that $|f'(x)| < 7/8$ for $1/2 \le x \le 7/8$ and that this interval is
taken into itself by f. Furthermore, any sequence defined inductively by
$x_{n+1} = f(x_n)$, with $x_0 \in [0, 1]$, eventually enters this attracting basin and converges to 3/4.
In particular, we choose $x_0 = 0$.

We prove by induction that $x_{2n}t^2 \le s(t) \le x_{2n+1}t^2$ for all t and all n. From this
and the fact that $x_n \to 3/4$, we may conclude that $\tfrac{3}{4}t^2 \le s(t) \le \tfrac{3}{4}t^2$, and thence
that $s(t) = \tfrac{3}{4}t^2$.

The statement with n = 0 is hypothesized, so let us show that if
$x_{2k}t^2 \le s(t) \le x_{2k+1}t^2$ holds for some $k \ge 0$, then $x_{2k+2}t^2 \le s(t) \le x_{2k+3}t^2$ holds also. We have

$$2t\,x_{2k+2} = t + \sqrt{t^2 - x_{2k+1}t^2} \le t + \sqrt{t^2 - s} \le t + \sqrt{t^2 - x_{2k}t^2} = 2t\,x_{2k+1},$$

where the equalities follow from the inductive definition of $x_n$ and the inequalities
follow from the induction hypothesis. Then by integrating, we obtain

$$\int_0^t 2\tau\,x_{2k+2}\,d\tau \le \int_0^t s'(\tau)\,d\tau \le \int_0^t 2\tau\,x_{2k+1}\,d\tau,$$

$$x_{2k+2}t^2 \le s(t) \le x_{2k+1}t^2,$$

where we have used the fact that s(0) = 0. Then, by similar reasoning, we reach
the desired conclusion that

$$x_{2k+2}t^2 \le s(t) \le x_{2k+3}t^2.$$
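
The squeezing behaviour of the sequence $x_n$ in Solution III is easy to observe numerically. The sketch below iterates f from $x_0 = 0$; the function names and the number of iterations are arbitrary choices.

```python
import math


def f(x):
    """The map f(x) = (1 + sqrt(1 - x)) / 2 from Solution III."""
    return 0.5 * (1.0 + math.sqrt(1.0 - x))


def iterate(count, x0=0.0):
    """Return [x_0, x_1, ..., x_count] with x_{n+1} = f(x_n)."""
    xs = [x0]
    for _ in range(count):
        xs.append(f(xs[-1]))
    return xs


xs = iterate(60)
# Even-indexed terms increase toward 3/4, odd-indexed terms decrease toward it.
for k in range(4):
    print(xs[2 * k], xs[2 * k + 1])
print(xs[-1])
```

Because f is decreasing with fixed point 3/4, the even-indexed terms give lower coefficients and the odd-indexed terms give upper coefficients, which is what the induction exploits.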


Editorial comment. These three solutions are representatives of the principal 
methods of solution. These may be summarized as follows. 

Method I: Guess the answer. Prove that it works. Then give careful attention to 
proving uniqueness. 

Method II: Transform the differential equation by a change of variable or
further differentiation into an equation whose complete solution can be found by
standard methods. Then impose the restriction that $0 \le s(t) \le t^2$.

Method III: Use the differential equation to iteratively produce explicit
refinements of the requirement that $0 \le s(t) \le t^2$ for all t > 0. Existence and
uniqueness will then follow from general fixed-point arguments. In this method, a familiar
method of proof of the existence and uniqueness theorem of differential equations
is applied, exploiting a global inequality on the solution to control $\int s'(t)\,dt$.


Solved by 55 readers, some submitting more than one solution, and the proposer. This yielded 21 
solutions by Method I, 19 by Method II, 16 by Method III, and 4 hybrids. In addition there were 7 
submissions found to be incomplete or inaccurate. 


Graceful Permutations 


E 3455 [1991, 646]. Proposed by D. G. Rogers, University of Aberdeen, Scotland, 
and Howard University, Washington, DC. 


It is known that if $n \equiv 0 \pmod 4$ or $n \equiv 1 \pmod 4$, then there exist
permutations $(x_1, x_2, \ldots, x_n)$ of $(1, 2, \ldots, n)$ such that the differences $|x_k - k|$, $1 \le k \le n$,
are all distinct. (Cf. E 3269 [1988, 554; 1989, 843].) Prove that the number of such
permutations is a multiple of 4.


Solution by M. Roth and O. Such, Queen’s University, Kingston, Ontario,
Canada. Let S be the set of permutations such that the specified differences are
all distinct. Let n > 1 to assure that the identity does not belong to S. We show
that the number of such permutations is a multiple of 4 by defining two involutions
$\pi$ and $\rho$ on S such that $\pi\rho(\sigma) = \rho\pi(\sigma)$ and $\pi(\sigma) \ne \rho(\sigma)$ for any $\sigma \in S$. If we
also show that $\pi$ and $\rho$ do not fix any element of S then the action of these
operations splits S into disjoint orbits of size 4, which proves $|S| \equiv 0 \pmod 4$.

Letting $\sigma = x_1, \ldots, x_n$, $\pi(\sigma) = y_1, \ldots, y_n$ and $\rho(\sigma) = z_1, \ldots, z_n$, we define
$\pi(\sigma)$ and $\rho(\sigma)$ explicitly by $y_k = j$ if and only if $x_j = k$, and $z_k = j$ if and only if
$x_{n+1-k} = n + 1 - j$. By construction, these produce permutations, preserve the set
of differences, and are involutions. Note that $\pi$ takes a permutation to its inverse.
An element of S cannot interchange a pair of points, and can have at most one
fixed point, so $\pi$ fixes no element of S. If $\rho(\sigma) = \sigma$, then $x_k = j$ if and only if
$x_{n+1-k} = n + 1 - j$, in which case the differences for positions k and n + 1 − k
have the same magnitude and $\sigma \notin S$.

By direct calculation, both $\pi\rho(\sigma)$ and $\rho\pi(\sigma)$ have n + 1 − k in position
n + 1 − j if and only if $x_k = j$, so $\pi\rho = \rho\pi$. All that remains to be shown is that
these maps are not equal on any member of S. Note that $\rho(\sigma)$ has $n + 1 - x_{n+1-k}$
in position k. If $\pi(\sigma) = \rho(\sigma)$, then $\sigma$ also has k in position $n + 1 - x_{n+1-k}$. This
makes the absolute difference between a position and its value the same at
position n + 1 − k and position $n + 1 - x_{n+1-k}$. If $\sigma \in S$, the differences are
distinct for distinct positions, and hence $k = x_{n+1-k}$ for all k. This is satisfied only
by a permutation which is not in S.
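
The orbit argument can be checked by brute force for small n. The sketch below enumerates the set S and applies the two involutions from the solution (inversion and the reversal map $z_k = n + 1 - x_{n+1-k}$); the function names are illustrative.

```python
from itertools import permutations


def is_graceful(perm):
    """True if the differences |x_k - k| for k = 1..n are all distinct."""
    return len({abs(x - k) for k, x in enumerate(perm, start=1)}) == len(perm)


def graceful_permutations(n):
    return [p for p in permutations(range(1, n + 1)) if is_graceful(p)]


def inverse(perm):
    """The map pi: y_k = j if and only if x_j = k."""
    inv = [0] * len(perm)
    for j, x in enumerate(perm, start=1):
        inv[x - 1] = j
    return tuple(inv)


def reversal(perm):
    """The map rho: z_k = n + 1 - x_{n+1-k}."""
    n = len(perm)
    return tuple(n + 1 - perm[n - k] for k in range(1, n + 1))


def orbit(perm):
    return {perm, inverse(perm), reversal(perm), inverse(reversal(perm))}


for n in (4, 5, 8):
    S = graceful_permutations(n)
    print(n, len(S), all(len(orbit(p)) == 4 for p in S))
```

For n = 4 the set S consists of a single orbit of four permutations.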


Editorial comment. Only a few solvers made explicit mention of the fact that the
statement of the problem needed to be modified to require n > 1. The term
“graceful” for the permutations with this property was suggested by Albert
Nijenhuis.


Solved also by D. Callan, R. J. Chapman (U.K.), P. CiZek (student, Czech Republic), M. Dindos 
(Slovakia), J. Fukuta (Japan), L. L. Gardner, R. High, A. A. Jagers (The Netherlands), I. Kastanas, 
K. S. Kedlaya (student), O. P. Lossers (The Netherlands), J. H. Nieto (Venezuela), A. Nijenhuis, J. H. 
Steelman, C. Voas, National Security Agency Problems Group, Shreveport Problem Solving Group 
(LSU), and the proposer. 


A Variant of the Erdős-Faber-Lovász Conjecture


6664 [1991, 655]. Proposed by Paul Erdős, Hungarian Academy of Sciences,
Budapest


Let G be a graph whose edges can be covered by n complete subgraphs with n
vertices each (i.e., G is the union of n copies of $K_n$, with no restrictions on shared
vertices).

(a) Prove that the chromatic number of G is less than $1 + n\sqrt{n - 1}$.

(b) Prove that this bound is asymptotically best possible, i.e., if f(n) is the
maximum chromatic number of a graph constructed in this way, then
$f(n) = (1 + o(1))\,n^{3/2}$.


Solution of (a) by Ilias Kastanas, California State University, Los Angeles, CA,
and by Charles Vanden Eynden, Illinois State University, Normal, IL
(independently). A graph with chromatic number k has at least $\binom{k}{2}$ edges, because if there
is no edge between the set of vertices of color i and the set of vertices of color j,
then colors i and j can be combined into a single color. On the other hand, G
has at most $n\binom{n}{2}$ edges, and the inequality $k(k - 1) \le n^2(n - 1)$ implies
$k < 1 + n\sqrt{n - 1}$.


Solution of (b) by Richard Holzsager, American University, Washington, DC.
Consider the affine plane of order p, where p is a prime. There are $p^2$ points and
$p^2 + p$ lines of size p, such that each pair of points appears in a unique line. If we
view these lines as cliques (complete graphs) on the points, then we have expressed
the complete graph $K_{p^2}$ as a union of $p^2 + p$ cliques of size p. We now expand
each point into a clique of p + 1 points to express $K_{p^3 + p^2}$ as a union of $p^2 + p$
cliques of size $p^2 + p$. Hence $f(p^2 + p) \ge p^3 + p^2 = (1 + o(1))(p^2 + p)^{3/2}$.

Now, let n be arbitrary, and fix $\varepsilon > 0$. By the prime number theorem, the
number of primes below x is eventually greater than the number of primes
less than $(1 - \varepsilon)x$, meaning there are primes between $(1 - \varepsilon)x$ and x, if x
is large enough. Taking n large enough and $x = \sqrt{n + 1/4} - 1/2$, we obtain
a prime p with $(1 - 2\varepsilon)n < p(p + 1) \le n$. Therefore, $f(n) \ge f(p^2 + p) =
(1 + o(1))(p^2 + p)^{3/2} = (1 + o(1))\,n^{3/2}$.
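
The construction in the solution of (b) can be verified mechanically for a small prime. The sketch below builds the affine plane over the integers mod p, expands every point into p + 1 copies, and checks that the resulting cliques cover a complete graph; the function names are illustrative and p is assumed prime.

```python
from itertools import combinations


def affine_lines(p):
    """All p^2 + p lines of the affine plane over the integers mod a prime p."""
    lines = []
    for slope in range(p):
        for intercept in range(p):
            lines.append(frozenset((x, (slope * x + intercept) % p) for x in range(p)))
    for c in range(p):  # vertical lines x = c
        lines.append(frozenset((c, y) for y in range(p)))
    return lines


def expanded_cliques(p):
    """Replace every point by p + 1 copies; each line becomes a clique of size p^2 + p."""
    return [frozenset((pt, i) for pt in line for i in range(p + 1))
            for line in affine_lines(p)]


def covers_complete_graph(cliques):
    """True if every pair of vertices lies together in at least one clique."""
    vertices = set().union(*cliques)
    covered = set()
    for clique in cliques:
        covered.update(frozenset(pair) for pair in combinations(clique, 2))
    n = len(vertices)
    return len(covered) == n * (n - 1) // 2


p = 3
cliques = expanded_cliques(p)
print(len(cliques), len(cliques[0]), len(set().union(*cliques)))
print(covers_complete_graph(cliques))
```

For p = 3 this gives 12 cliques of size 12 covering the complete graph on 36 vertices, whose chromatic number is 36.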


Editorial comment. This problem was first received from the proposer by the
editors in 1987. In 1988, P. Horak heard of the problem and found a solution,
which was published as “A coloring problem related to the Erdős-Faber-Lovász
conjecture,” J. Combinatorial Theory, Ser. B 50 (1990), 321-322.

Suppose we add the constraint that each edge of G appears in exactly one
clique (note that this is violated by the construction in (b)). The Erdős-Faber-Lovász
conjecture is that in this case the chromatic number is exactly n.

Zoltán Füredi improved the upper bound of (a) when n is of the form $q^2 + q$.
He proved that $q^3 + q^2$ is an upper bound in this case, which makes the projective
plane construction of (b) optimal when $n = q^2 + q$ and q is a prime power.


The problem was completely solved by all three solvers cited above. Solutions were also given by the
proposer and by Z. Füredi.


More Pigeons on the Circle 


E 3463 [1991, 852]. Proposed by Donald E. Knuth, Stanford University, Stanford, 
CA. 


Let S be a set of m distinct points on the unit circle such that no two are 
diametrically opposite. For a fixed integer n < m/2, suppose that we mark every 
point p in S such that fewer than n of the remaining points in S lie in the 
semicircle counterclockwise from p. Prove that at most n points are marked. 


Solution by John H. Lindsey II, Fort Myers, FL. Fix n and delete unmarked
points one by one, each time allowing unmarked points to become marked as the
semicircles empty out, until only 2n points remain or all remaining points are
marked. At this time a point is marked only if the nth point later is more than a
semicircle away. If that point is also marked, then traversing 2n points travels
more than the full circle, which happens only if fewer than 2n points remain.
Hence the stopping condition occurs when exactly 2n points remain, and the
marked points consist of exactly one point from each pair of points separated by
n − 1 points in each direction. This implies there were at most n marked points in
the original set.
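
The statement of E 3463 lends itself to a randomized sanity check. The sketch below places points at random angles, so that with probability 1 no two are diametrically opposite, and counts the marked points; it is an experiment rather than a proof, and the function name is illustrative.

```python
import math
import random


def marked_points(angles, n):
    """Points with fewer than n other points in the open semicircle counterclockwise from them."""
    marked = []
    for p in angles:
        ahead = sum(1 for q in angles
                    if q != p and 0 < (q - p) % (2 * math.pi) < math.pi)
        if ahead < n:
            marked.append(p)
    return marked


rng = random.Random(0)
worst = 0
for _ in range(200):
    m = rng.randint(3, 12)
    n = rng.randint(1, (m - 1) // 2)  # guarantees n < m/2
    angles = [rng.uniform(0, 2 * math.pi) for _ in range(m)]
    count = len(marked_points(angles, n))
    assert count <= n
    worst = max(worst, count)
print("largest number of marked points seen:", worst)
```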


Solved by 26 other readers and the proposer.


A Half Step Towards Carmichael’s Conjecture


6671 [1991, 862]. Proposed by Carl Pomerance, University of Georgia, Athens, GA.


Let V(x) denote the number of distinct values not exceeding x taken on by
Euler’s arithmetical function $\varphi$. Let V*(x) denote the number of these values with
a unique pre-image. For example, V(15) = 7, V*(15) = 0.

R. D. Carmichael conjectured that V*(x) = 0 for all x. Prove the weaker
assertion that $\liminf_{x \to \infty} V^*(x)/V(x) < 1$.
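
The example V(15) = 7, V*(15) = 0 can be reproduced with a totient sieve. The sketch below relies on the known bound $\varphi(n) \ge \sqrt{n/2}$, so every pre-image of a value not exceeding x is at most $2x^2$; the function names are illustrative.

```python
from collections import Counter


def totients(limit):
    """Sieve Euler's function for 0..limit."""
    phi = list(range(limit + 1))
    for p in range(2, limit + 1):
        if phi[p] == p:  # p is prime: it has not been reduced by a smaller prime
            for multiple in range(p, limit + 1, p):
                phi[multiple] -= phi[multiple] // p
    return phi


def value_counts(x):
    """Map each totient value v <= x to its number of pre-images."""
    phi = totients(2 * x * x)  # phi(n) >= sqrt(n/2), so n <= 2 x^2 suffices
    return Counter(v for v in phi[1:] if v <= x)


def V(x):
    return len(value_counts(x))


def V_star(x):
    return sum(1 for c in value_counts(x).values() if c == 1)


print(V(15), V_star(15))
```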


Solution by L. E. Mattics, University of South Alabama, Mobile, AL. Let
$\varepsilon = 2^{1/2} - 1$ and let $V_o(x)$ be the number of values of $\varphi(w)$ not exceeding x such
that w is odd. We will show that we can prove that $\liminf_{x \to \infty} V^*(x)/V(x) \le 2^{-1/2}$
regardless of whether or not the following proposition holds.


Proposition. For every positive integer N there is an $x \ge N$ such that
$V_o(x/2) \ge \varepsilon V(x)$.


If the proposition does hold then there are arbitrarily large x such that

$$V\!\left(\frac{x}{2}\right) \ge V_o\!\left(\frac{x}{2}\right) + V^*\!\left(\frac{x}{2}\right)
\ge \varepsilon V(x) + V^*\!\left(\frac{x}{2}\right)
\ge (\varepsilon + 1)\,V^*\!\left(\frac{x}{2}\right) = 2^{1/2}\,V^*\!\left(\frac{x}{2}\right),$$

so $\liminf_{x \to \infty} V^*(x)/V(x) \le 2^{-1/2}$.

Assume from now on that the proposition does not hold. Then there is an
integer N such that for all $x \ge N$, $V_o(x/2) < \varepsilon V(x)$. If (2, u) = 1 and $\varphi(2^a u) \le x$
has only one pre-image, then $a \ge 2$ and $\varphi(u) \le x/2^{a-1}$; and if v is odd and
$\varphi(u) = \varphi(v)$, then $\varphi(2^a u) = \varphi(2^a v)$, so u = v. This implies that
$V^*(x) \le \sum_{a=2}^{\infty} V_o(x/2^{a-1})$, where $V_o(c) = 0$ if c < 1.

Now let $m = \lfloor \log_2(x/N) \rfloor + 1$; then


[o.@) 


V*(x)'< Der) + > Vo| =| 


a=m+1 


E N 

< 72.) + V(N) + rol =| + ese f, 

Since $V(x) \to \infty$ as $x \to \infty$ and N is fixed we have
$\liminf_{x \to \infty} V^*(x)/V(x) \le 2^{-1/2}$.


Editorial comment. All solutions provided an upper bound on the quantity
$\liminf_{x \to \infty} V^*(x)/V(x)$. The best value obtained to date was 1/2, which was
given by the proposer. All solutions considered the set $\Phi_o = \{m : m = \varphi(2k + 1)$
for some k$\}$, and noted that, if m is $\varphi(n)$ for some n, then there is an integer h
with $m/2^h \in \Phi_o$. This should be compared to problem E 3661 [1990, 63; 1991,
443] in which examples of $\varphi(n) \notin \Phi_o$ were given.


Solved also by I. Kastanas and the proposer. 


Consecutive Convergents 


10187 [1992, 60]. Proposed by Irving Adler, North Bennington, VT. 


Suppose $n_{k-1}/d_{k-1}$ and $n_k/d_k$ are consecutive convergents of the simple
continued fraction for some real number $\alpha$ in (0, 1). Assume you are given only the
values of $d_{k-1}$ and $d_k$. Construct an algorithm for determining the values of $n_{k-1}$
and $n_k$.


Solution by Nicholas C. Singer, Annandale, VA. The problem as stated has two
possible solutions, because the convergents to $\alpha$ and $1 - \alpha$ have essentially the
same sequence of denominators. That is, if $0 < \alpha < 1/2$, then $\alpha$ has the
continued fraction expansion $\alpha = [0, a_1, a_2, a_3, \ldots]$ with $a_1 \ge 2$; and then
$1 - \alpha = [0, 1, a_1 - 1, a_2, a_3, \ldots]$. Using the standard recurrence relation

$$d_k(\alpha) = a_k d_{k-1}(\alpha) + d_{k-2}(\alpha), \qquad d_0(\alpha) = 1, \quad d_{-1}(\alpha) = 0,$$

we conclude that for $k \ge 1$, $d_k(1 - \alpha) = d_{k-1}(\alpha)$. We need exactly one bit of
additional information to get a unique answer: (i) is $\alpha$ greater than or less than
1/2? or (ii) what is the parity of k?

It is immediate, using the recurrence relations, that $d_k/d_{k-1} =
[a_k, a_{k-1}, a_{k-2}, \ldots, a_1]$. The quotients and convergents are calculated using the
usual continued fraction algorithm (which is equivalent to the Euclidean algorithm). The $n_k$
satisfy the same recurrence as the $d_k$ with the initial conditions replaced by
$n_0 = 0$, $n_{-1} = 1$. In addition, $n_0 = a_0 = 0$ so we also have $n_k/n_{k-1} =
[a_k, a_{k-1}, a_{k-2}, \ldots, a_2]$. (The case k = 1 is special since $n_0 = 0$.) That is, we take
$n_k$ and $n_{k-1}$ to be the numerator and denominator of the penultimate convergent
of the continued fraction expansion of $d_k/d_{k-1}$.

The usual application of the Euclidean algorithm always gives $a_1 \ge 2$. However,
$[a_k, a_{k-1}, a_{k-2}, \ldots, a_2, a_1] = [a_k, a_{k-1}, a_{k-2}, \ldots, a_2, a_1 - 1, 1]$, which leads to the
alternative expansion $n_k/n_{k-1} = [a_k, a_{k-1}, a_{k-2}, \ldots, a_2, a_1 - 1]$. This
corresponds to $1 - \alpha = [0, 1, a_1 - 1, a_2, \ldots, a_k, \ldots] > 1/2$. This expansion of $n_k/n_{k-1}$
has k convergents, whereas the previous expansion had k − 1. Hence knowing the
answer to either question (i) or question (ii) allows us to produce the unique
correct result.
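
The algorithm described in the solution can be sketched as follows. The function names and the boolean flag standing for the answer to question (i) are illustrative choices, and the sketch assumes $d_k > d_{k-1} \ge 1$.

```python
from fractions import Fraction


def cf_terms(num, den):
    """Partial quotients of num/den via the Euclidean algorithm."""
    terms = []
    while den:
        q, r = divmod(num, den)
        terms.append(q)
        num, den = den, r
    return terms


def evaluate(terms):
    """Value of the finite continued fraction [t_0, t_1, ..., t_j]."""
    value = Fraction(terms[-1])
    for a in reversed(terms[:-1]):
        value = a + 1 / value
    return value


def numerators(d_prev, d_cur, alpha_below_half=True):
    """Return (n_{k-1}, n_k) from the consecutive denominators (d_{k-1}, d_k)."""
    terms = cf_terms(d_cur, d_prev)  # [a_k, ..., a_2, a_1] with a_1 >= 2
    if alpha_below_half:
        head = terms[:-1]  # [a_k, ..., a_2]
    else:
        head = terms[:-1] + [terms[-1] - 1]  # [a_k, ..., a_2, a_1 - 1]
    if not head:  # the special case k = 1, where n_0 = 0 and n_1 = 1
        return 0, 1
    ratio = evaluate(head)  # equals n_k / n_{k-1} in lowest terms
    return ratio.denominator, ratio.numerator


# alpha = [0, 2, 3, 4] has convergents 0/1, 1/2, 3/7, 13/30.
print(numerators(7, 30))
# 1 - alpha = [0, 1, 1, 3, 4] has convergents 0/1, 1/1, 1/2, 4/7, 17/30.
print(numerators(7, 30, alpha_below_half=False))
```

The same pair of denominators (7, 30) yields two different answers, which is the one-bit ambiguity the solution describes.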


Solved also by D. Callan, R. J. Chapman (U. K.), D. Chinitz (student), C. H. Ebersole, B. Haible 
(Germany), R. J. Hendel, R. High, O. P. Lossers (The Netherlands), A. Nijenhuis, J. H. Steelman, R. 
Stong, B. M. M. de Weger (The Netherlands), E. A. Weinstein, O. Wyler, Anchorage Math Solutions 
Group, National Security Agency Problems Group, and the proposer. 


Complex Conjugation of C(z) 


10191 [1992, 61]. Proposed by Dragomir Ž. Đoković, University of Waterloo,
Waterloo, Ontario, Canada.


Let G be the group of C-automorphisms of the function field C(z) and $\Sigma$ the
set of involutory automorphisms of C(z) which extend the complex conjugation on
C. Show that $\Sigma$ splits into two orbits under the action $G \times \Sigma \to \Sigma$,
$(\alpha, \beta) \mapsto \alpha \circ \beta \circ \alpha^{-1}$. (Thus there are only two essentially different ways of extending the
complex conjugation to an involutory automorphism of C(z).)


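Solution by Robin J. Chapman, University of Exeter, Exeter, U.K. It is well
known that if $\alpha \in G$ then $\alpha$ is determined by $\alpha(z) = (az + b)/(cz + d)$ where
$A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ is a non-singular matrix over C. Two different choices of A give the
same $\alpha$ if and only if they are scalar multiples of each other. Also, composition in
G corresponds to matrix multiplication. Furthermore, an automorphism $\beta$ of C(z),
restricting to conjugation on C, is also determined by $\beta(z) = (pz + q)/(rz + s)$
where $B = \begin{pmatrix} p & q \\ r & s \end{pmatrix}$ is a non-singular matrix over C; and again, two different B give
the same $\beta$ if and only if they are scalar multiples. Now we easily compute that
$\beta^2(z) = (tz + u)/(vz + w)$ where $C = \begin{pmatrix} t & u \\ v & w \end{pmatrix} = B\bar{B}$. Hence $\beta \in \Sigma$ if and only if
$B\bar{B} = \lambda I$ for some $\lambda \in \mathbf{C}$. If $\beta \in \Sigma$ then $\lambda^2 = \det B \det \bar{B} = |\det B|^2$. Hence
$\lambda = \pm|\det B|$. As replacing B by $\mu B$ changes $\lambda$ to $|\mu|^2\lambda$ and $|\det B|$ to
$|\mu|^2|\det B|$, the sign of $\lambda$ is an invariant of $\beta$. If $B = \begin{pmatrix} 0 & \pm 1 \\ 1 & 0 \end{pmatrix}$ then $B\bar{B} = \pm I$
and so both signs occur. Call $\beta$ positive or negative according to the sign of $\lambda$.

I claim now that if $\alpha \in G$ and $\beta \in \Sigma$ then $\alpha \circ \beta \circ \alpha^{-1}$ is positive if and only if
$\beta$ is. If $\alpha$ and $\beta$ are represented by the matrices A and B respectively
then $\alpha \circ \beta \circ \alpha^{-1}$ is represented by $B' = A^{-1}B\bar{A}$. Now $B'\bar{B'} = A^{-1}B\bar{A}\,\bar{A}^{-1}\bar{B}A =
A^{-1}B\bar{B}A = \lambda I$ and the claim follows. Hence $\Sigma$ splits into at least two G-orbits.

Take $\beta \in \Sigma$. We may represent $\beta$ by a matrix B with $\det B = -\operatorname{sgn}(\beta)$. Hence
$\bar{B} = -(\det B)B^{-1}$ and $B = \begin{pmatrix} a & b \\ c & -\bar{a} \end{pmatrix}$ where $a \in \mathbf{C}$, $b \in \mathbf{R}$, $c \in \mathbf{R}$ and $|a|^2 + bc = \pm 1$.
Now if $|\xi| = 1$ and $A_1 = \begin{pmatrix} \xi & 0 \\ 0 & \bar{\xi} \end{pmatrix}$ then $B' = A_1^{-1}B\bar{A}_1 = \begin{pmatrix} \bar{\xi}^2 a & b \\ c & -\xi^2\bar{a} \end{pmatrix}$, so that by a
suitable choice of $\xi$ we may assume that $B' = \begin{pmatrix} a' & b \\ c & -a' \end{pmatrix}$ where $a' \in \mathbf{R}$ and $\det B' = \mp 1$.
Hence B' has characteristic polynomial $X^2 \mp 1$ and so there is a real matrix
$A_2$ with $A_2^{-1}B'A_2 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$ if $\beta$ is positive and $A_2^{-1}B'A_2 = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$ if $\beta$ is negative.
Hence if $A = A_1A_2$ then $A^{-1}B\bar{A} = \begin{pmatrix} 0 & \pm 1 \\ 1 & 0 \end{pmatrix}$ and so there are at most two G-orbits
in $\Sigma$.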
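
The invariance of the sign can be illustrated numerically. The sketch below represents 2 × 2 complex matrices as nested lists and assumes the convention that $\alpha \circ \beta \circ \alpha^{-1}$ is represented by $A^{-1}B\bar{A}$; the helper names and the sample matrix A are arbitrary.

```python
def matmul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def conj(X):
    """Entrywise complex conjugate."""
    return [[complex(z).conjugate() for z in row] for row in X]


def inverse(X):
    (a, b), (c, d) = X
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def sign_of(B):
    """Return +1 or -1 when B * conj(B) = lambda * I with lambda real and nonzero."""
    P = matmul(B, conj(B))
    lam = complex(P[0][0])
    assert abs(P[0][1]) < 1e-9 and abs(P[1][0]) < 1e-9
    assert abs(P[1][1] - lam) < 1e-9 and abs(lam.imag) < 1e-9
    return 1 if lam.real > 0 else -1


def conjugate_by(A, B):
    """Matrix A^{-1} B conj(A), assumed to represent alpha o beta o alpha^{-1}."""
    return matmul(matmul(inverse(A), B), conj(A))


A = [[1 + 2j, 3], [0.5j, 2 - 1j]]  # an arbitrary non-singular matrix
for B in ([[0, 1], [1, 0]], [[0, -1], [1, 0]]):
    print(sign_of(B), sign_of(conjugate_by(A, B)))
```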


Editorial comment. Robin Chapman also provided a cohomological interpreta- 
tion of the result. If we let T = Gal(C/R), then T acts on G = PGL(C) by 
conjugation and it is not hard to see that the elements of / correspond to 
1-cocylces of I in G and that two elements of ) correspond to cohomologous 
cocycles if and only if they lie in the same G-orbit. Hence the set of orbits 
corresponds to the set H 1(T, G). Using the exact sequence of I'-modules 


1 > C* > GL,(C) > PGL,(C) > 1 


a standard theorem of Galois cohomology (Jean-Pierre Serre, Local Fields, 
Springer-Verlag, 1978, X.5), shows that the connecting map 6: H'\(T,G) > 
H?(l, C*) is an isomorphism. Now as [ is cyclic, it is immediate that H?(T, C*) = 
R* /N(C*) = {+1} where N: C — R is the norm map, and the result follows. More 
generally if we replace RK and C by K and L where L/K is a quadratic extension 
with Galois group I then the corresponding result is that the G-orbits of L are in 
one-to-one correspondence with H'\(T,G) = H7(T, L*) = K*/N, ,x(L*). 

Now $H^2(\Gamma, L^*)$ is the relative Brauer group $\mathrm{Br}(L/K)$, and as $|L : K| = 2$ this
can be interpreted as the set of equivalence classes of 1-dimensional Severi-Brauer
varieties over $K$ split by $L$. These are the projective curves defined over $K$ which
become isomorphic to the projective line after base change to $L$. If $\beta \in \Sigma$ then
the fixed field $L(z)^\beta$ is the function field of the corresponding Severi-Brauer
variety. If $L = C$ and $K = R$ and if $\beta$ is positive, then $C(z)^\beta = R(t)$, the function
field of the projective line over $R$; and if $\beta$ is negative, then $C(z)^\beta = R(x, y \mid x^2 +
y^2 + 1 = 0)$, the function field of the conic $C$ with homogeneous equation
$X_0^2 + X_1^2 + X_2^2 = 0$ which has no points defined over $R$. Explicitly, if $\beta(z) = \bar{z}$,
then $\beta$ is positive and $C(z)^\beta = R(z)$; while if $\beta(z) = -1/\bar{z}$, then $\beta$ is negative
and $C(z)^\beta = R(x, y)$ where $x = (z - 1/z)/2$ and $y = i(z + 1/z)/2$ satisfy $x^2 +
y^2 = -1$.
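
The relation between $x$ and $y$ can be checked by direct expansion. The following is only a routine verification of the stated identity, written out for convenience; it is not part of the original comment.

```latex
\begin{align*}
x^2 + y^2 &= \frac{(z - 1/z)^2}{4} + \frac{i^2\,(z + 1/z)^2}{4} \\
          &= \frac{(z^2 - 2 + z^{-2}) - (z^2 + 2 + z^{-2})}{4}
           = \frac{-4}{4} = -1.
\end{align*}
```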

The proposer’s proof that there are at most two orbits involved showing that any
matrix $B$ with $B\bar{B} = I$ can be written as $A^{-1}\bar{A}$, for which he referred to D. Ž.
Đoković, “On some representations of matrices”, Linear and Multilinear Algebra, 4
(1976), 33-40.


Solved also by D. Callan and the proposer. 


Collaborating editors: David F. Appleyard, Paul T. Bateman, Bruce C. Berndt, 
Duane M. Broline, Barry W. Brunson, Frank S. Cater, Gulbank D. Chakerian, 
Underwood Dudley, Gerald A. Edgar, Michael A. Filaseta, Ira M. Gessel, Richard 
A. Gibbs, Jerrold R. Griggs, Douglas A. Hensley, John R. Isbell, Mourad E. H. 
Ismail, Murray Klamkin, Daniel J. Kleitman, Frederick W. Luttmann, Frank B. 
Miles, Richard Pfiefer, Stephen L. Portnoy, J. O. Shallit, John Henry Steelman, 
Kenneth B. Stolarsky, David E. Tepper, Douglas B. Tyler, Daniel Ullman, and 
William E. Watkins. 


“You know, for a mathematician he 
did not have enough imagination. But 
he has become a poet and now he is 
doing fine.”


—Hilbert (to Cassirer, 
about a former student) 


Answer to Picture Puzzle 
(p. 661) 
George Pólya.


Answer to Who Was the Author 
(p. 681) 


James Joseph Sylvester. 
Thomas Archer Hirst:
Mathematician Xtravagant 
IV. Queenwood, France and Italy 


J. Helen Gardner and Robin J. Wilson 


I occupied myself most of the day by sketching out a kind of inaugural lecture for Queenwood. It 
is now certain that in August I shall commence my life as Tutor there. It has for me its 
attractions and at the same time its onerous duties and responsibilities—I meet both cheerfully, 
and I hope for strength and courage to fulfil the task I have chosen for myself in the world. 


Thomas Hirst returned to England in mid-summer 1853. At a time when there 
were limited job opportunities for a young mathematician, he was lucky to be 
offered a teaching post at Queenwood College, near Salisbury, where John Tyndall 
and his chemist friend Edward Frankland had taught before their Marburg days. 


Queenwood College 


Queenwood itself is a beautiful spot, it stands in a rich undulated chalk district, the small knowls 
and vallies are always graceful and smooth and the rich woods, with their beautiful beeches, yews 
and elms have a soothing effect. The building itself is interesting on many accounts, first, its 
architecture which is in a novel and picturesque style, mostly Italian. Secondly, its inward 
arrangements, which are the most convenient and beautiful that I have seen, and thirdly its 
associations, for this is the celebrated Harmony Hall, where the socialists first practically tried to 
live by the law of love, and of course miserably failed... now it makes one of the most beautiful 
schools I ever saw, and from all accounts the scholastic arrangements are just as good. 


1993] THOMAS ARCHER HIRST 723 


Queenwood was partly boys’ elementary school, partly mechanics’ institute, a
sort of technical college. There was a strong Pestalozzian influence, in that the
teaching emphasized practical work by the pupils. For example, Hirst taught 
geometry in the context of surveying, rather than as theorems from Euclid. This 
experience was to prove useful later when he emerged as a reformer of geometry 
teaching in schools. 


14th August 1853: We have now got thoroughly to work. I have 13 hours a week teaching, and 
two lectures; and I get more and more to love my work. The profession of schoolmaster is no 
drudgery, but when properly undertaken a noble task, and a healthy discipline. Yes, I have come 
to the conclusion that I have found my proper task, and to the determination to fulfil it to the 
best of my ability. At present it occupies nearly all my time, and must do until I am thoroughly 
master of my best plan for tuition. That done, then I sit down to my own investigations. 


He was also developing a reputation for giving public lectures on physics. 


30th October 1853: ...I have now conquered to a great extent all nervousness—it would be no 
task to speak to a thousand people about a subject with which I was well acquainted. 
Nevertheless, I have not yet got the tact to know what parts will be best appreciated. I found 
afterwards that exactly the parts on which I had laid least stress were best appreciated, and vice 
versa. This talent which I lack is essentially necessary to a popular lecturer... 


When he could, he escaped to London to see Tyndall and to attend popular 
science lectures at the Royal Institution, where Tyndall was now working. 


21st January 1854: Friday evening I attended Faraday’s lecture, “Electricity of Induction static
and dynamic effects.” The lecture room was filled with a very brilliant audience, and the lecture
itself pleased me much... Perhaps the lecturer’s manner, person and celebrity attracted me
most. There was about him such a total absence of mannerism and pretension, such geniality and 
gentleness... 


As John and I were sitting writing, after tea this evening, Faraday himself paid us a brief 
visit... The room smelt villainously of tobacco, although John hurriedly scattered some eau de 
Cologne on the carpet. The candles too just went out and Faraday made his entrance almost in 
the dark. After a short time, during which Faraday had bowed to me, John remarked that I was a 
friend of his. “Oh, indeed!” says Faraday, fetching a chair, “well, let us all sit down, and have a
look at one another.” He did sit down, and after looking at me for a minute got up and shook 
me very kindly by the hand, saying it was a pleasure to him to know any of Tyndall’s friends... 


Hirst’s attitude to women was somewhat prudish and patronizing. While in 
Marburg, he had struck up an acquaintance with a young lady called Anna Martin. 


3rd July 1854: ...Instinctively I got to admire her, her artlessness, her affection for her own 
family, her honest independence, and even waywardness towards me, and finally her frank 
friendship for me in spite of all my bluntness and scolding—all these things, no doubt, besides 
many others, drew her nearer to me... 


They were married late in 1854 at Anna’s home in County Down, and returned to 
Queenwood after spending their honeymoon in Paris. Hirst was blissfully happy. 


18th February 1855: ...I have convinced myself that she is and will be a true and devoted
companion. There is in her a far deeper devotedness than I could have anticipated... I have 
found that her happiness consists not in comforts and luxuries, but rests on the far higher and 
more womanly consciousness that she is necessary to her husband’s happiness... Her failings, as 
failings of course she has, I can trace almost entirely to her irregular life and training... yet 
when I do see her bustling about in her own cheerful, merry way I forget her inertia and 
consider her the best little housewife in Christendom... Let me close the passage by thanking 
God for her, and expressing the ever stronger determination to guard and cherish her for ever. 
Married life clearly suited him, and it was also a successful time for his mathematical
researches.


10th February 1856: ...I have succeeded in establishing a very general and very interesting 
theorem with respect to the surfaces which equally attract a given point. I hope before 
midsummer to have a very pretty investigation ready for publication... 


But it was all too good to last. Shortly after their wedding, Anna began to show 
signs of advancing tuberculosis. The symptoms became increasingly worse, and 
Hirst eventually resigned his job at Queenwood to devote himself to her. From 
1856-1857 they travelled in the South of France, vainly searching for a cure. While 
there, he wrote two papers arising out of his earlier work at Göttingen with Gauss
and Weber, and these were published in the Philosophical Magazine. 

At the same time, his mathematical reading continued to be extensive and 
intelligent. Even if a work was badly written, he would persevere with it because 
the subject itself mattered to him. William Rowan Hamilton’s work on quater- 
nions, Carl Jacobi’s Elliptical functions (in Latin), and Sartorius von Walter- 
shausen’s Life of Gauss were among the works on which he commented, often 
critically: 


14th September 1856: For the last week I have been studying Spottiswoode on Determinants in 
Crelle’s Journal. It is obscurely written and badly printed, and hence very laborious to 
understand; but as I am determined to master the subject, I shall spare no pains... 


18th January 1857: ...I have purchased too an admirable work of Euler’s, namely his Letters to 
a German Princess on subjects connected principally with Physics. The most unscientific person 
could understand them, they are written with wonderful clearness. I wonder a good translation 
of them has never been used as reading lessons in our schools. His subjects are not so 
elementary; it is the lucid style that deceives one into the belief that the subject is simple. 
Therein consists an infallible sign of an able writer... 


Eventually they settled in Paris, where he made the acquaintance of a curious old 
fellow... 


21st June 1857: He is at the same time door-keeper, boot and shoe maker, and mathematician!!!! 
Like most self-educated men, he is extremely opinionated and almost a monomaniac. Neverthe- 
less he is an original and altogether a remarkable shoemaker. He takes great delight in giving me 
problems to solve, and is disappointed when I solve them correctly. At present he pronounces 
my solution of the following problem to be incorrect: “A man borrows 300 francs, for which he is 
to pay interest at the rate of 5 per cent per ann. If he pays 20 francs a month instead of the 
interest which is really due, how soon will he have repaid the sum borrowed?” ... 


In July the inevitable happened. Anna died, leaving Hirst devastated: 


2nd July 1857: Poor Anna suffers no more, she is at peace for ever. Formerly she read my 
journal and I had always to write accordingly, to leave all my anxieties and fears unexpressed. 
Now she will read no more... 


John Tyndall was on his way to Switzerland to study the structure and movement 
of glaciers, when he received intelligence of the calamity. He took Hirst with him 
to Switzerland, and the two became even closer friends. In August 1857 they were 
joined by their friend Thomas Huxley and made one of the earliest ascents of 
Mont Blanc. 
Hirst never fully got over the tragedy of Anna’s death, and paid regular visits to
her Paris grave for the rest of his life. Deciding that the time had come to devote 
himself entirely to research, and perhaps wishing to remain near Anna, he settled 
himself in Paris. 


18th October 1857: My health has continued on the whole good, and I have worked very steadily 
all the week. Still my progress does not satisfy me. When I consider that I have been nearly two 
months engaged on a small geometrical research which is a little out of my direct line I feel 
inclined to lose patience. But I must not. I have commenced a subject and I will give it some 
kind of finish before I pass to another... 
Collège de France


As he became more involved in his researches, he resumed his practice of paying 
visits to mathematicians. The foremost French mathematicians at this time were 
Joseph Liouville and Joseph Bertrand at the Collège de France, Michel Chasles at
the Sorbonne, and the retired Louis Poinsot. 


18th November 1857: On Saturday last I paid M. Liouville a visit. It is long since I first 
entertained the idea of this visit... He is a pleasant, chatty little man with whom I soon felt at 
perfect ease. The only blemish I observed in him was an occasional unmeaning giggle. We talked 
of Dirichlet, of Steiner, of Poinsot, of Cayley and of Sylvester, in the chattiest, frankest manner. 
His remarks on all these men were shrewd and just. I coincided entirely. And I must confess I
heard with some satisfaction his remarks on Cayley’s productions. He acknowledged their ability 
but he protested against their wilful obscurity. He considers Cayley and Sylvester to be in some 
measure the disciples of Cauchy in this respect... To be precise and clear is equivalent in their 
eyes to being tedious. Rather than march over their difficulties and through their conquered 
territory with a firm, steady step, they leap and turn somersaults. It is possible that by so doing 
they are able to take a rapid and sufficient view of their subject, but others decidedly see better 
with their head upwards... 
I went to hear Chasles’ first lecture on Geometry, and was far from satisfied with it. Perhaps
he was in bad humour—certainly he did not enter with his whole might into his subject. He 
hesitated and bungled much, and altogether his lecture formed a sad contrast to his books which 
are remarkably clearly written. But even his books are not to be compared to Steiner’s in grasp 
of his subject... 


Joseph Liouville (1809-1882) Michel Chasles (1793-1880) 


Much of Hirst’s time was spent in translating important mathematical works into 
English. One such work was an important memoir on the percussion of bodies by 
Louis Poinsot. This gave him the opportunity to visit Poinsot at his house, where 
he was met by a footman and conducted to an elegant salon to meet the old man. 


20th December 1857: ... He shook me kindly by the hand, bid me be seated, and took his seat 
near me. He is now between 60 and 70 years old, with silver silken hair neatly arranged on a fine 
intelligent head. He is tall and thin, but although he now stoops with age and feebleness one can 
see that one time his figure was more than ordinarily graceful. He was loosely but neatly dressed 
in a large ample robe de chambre. His features are finely moulded—indeed everything about the 
man betokens good blood. His eyes are now dim and dull with age, and recede far behind two 
prominent eyebrows. He talks incessantly and well. I did not misunderstand a word, although he 
spoke always in a low tone, and now and then his voice dropped as if from weariness, but he 
never wandered from his point... 


Poinsot was delighted to discuss his works with Hirst, and was clear and interesting 
in his explanation of them. He seemed touched to hear of his influence on the 
young Hirst, remarking “We cast our seed upon the waters knowing not where it
may fall, but it is nevertheless pleasant after long years of labour to find that these 
seeds have taken root.” He presented him with copies of all his works, which
pleased the recipient greatly. Hirst obviously read them, for he was soon to 
write... 


10th January 1858: Without exception Poinsot’s is the neatest and most lucid mathematical 
treatise I know. I find it difficult to put down the book just as in my younger days I found it 
difficult to put aside an interesting novel. Poinsot is one of the few mathematicians who dislike 
to leave to calculus the task and the merit of arriving at results. With most of us calculation is 
more than an instrument in our hands, it is a servant in our service to which servant we appoint a 
task and are but too prone to accept the result he brings to us without enquiring how it has been 
achieved—Poinsot on the contrary works with this servant, watches his every act and directs the 
same. The consequence is the result is thoroughly his own... Every thing he touches he strives 
to exhaust, he is not satisfied with a simple perception of a truth but he regards it from all sides
laboriously and perseveringly until he has found out the path which will lead himself and others 
most directly and easily to the goal. For young mathematicians I should deem him an admirable 
instructor. 


In January 1858 he received copies of a memoir he had written for Liouville’s 
Journal, and ‘saw with some little pleasure my name amongst the list of contribu- 
tors on the cover’, names such as Cayley, Gauss, Jacobi and Dirichlet. But this was 
not the only exciting event of that month... 


17th January 1858: On the evening of this same day the Emperor of the French [Napoleon III] 
narrowly escaped assassination at the entrance of the Grand Opera. As usual a crowd was 
assembled in the Rue Lepeletier to see the arrival of the Emperor and Empress. As their 
carriage drew up three loud detonations were successively heard, three infernal machines 
(grenades) exploded under or near his carriage killing and wounding more than a hundred of the 
spectators, smashing his carriage and slaying one of the horses... 


His mathematical interests now took a new direction, as his work on equally 
attracting surfaces continually caused him to deviate into geometry. 


31st January 1858: ... Having found that two surfaces inscribed in the same cone attract the 
vertex of the latter equally, provided that radii vectors having the same direction are inversely 
proportional in length I am led to study what I call inverse figures generally. I call two figures 
inverse with respect to a point O chosen as the centre of inversion, when to every point A of the 
one corresponds a point A’ of the other so that A and A’ are on a line through O and the 
rectangle AO.A’O is constant...


14th February 1858: ... My method of inverse transformation is leading me to a class of curves 
of the fourth degree which possess properties precisely analogous to but more general than
conics. To every theorem in conics concerning points, lines and circles corresponds another with 
reference to these higher conics concerning points and circles. Conics are as it were turned 
inside out, their infinitely distant points becoming all concentrated in one point in the plane 
which I call the point of inversion. 


Although he was pursuing his researches in mathematics, he maintained a strong 
interest in the sciences. He was particularly fascinated by the election to the 
membership of the Mechanical Section of the Academy of Sciences. 


7th March 1858: ... Foucault is a candidate. I noted last night that Chasles will not and 
Bertrand will support him. His not being a mathematician will in all probability be fatal. Chasles 
designates his gyroscope and researches on the pendulum as happy, but neither indicative of 
genius or promising in results... 
... With respect to Bertrand I am still in
doubt whether his harsh, forbidding, arrogant 
exterior is a true index of his character or 
merely a cloak to a better nature. To me it is 
extremely disgusting, the air he assumes. His 
manner to me appears to repel you by the 
announcement “what you are telling me may 
interest you, but as to me I knew it all before 
and much more—in fact with respect to 
mathematics I am decidedly blasé, I may be 
said to have utterly exhausted that elementary 
science.” 


Joseph Bertrand (1822-1900) 


By April, his investigation on equally attracting surfaces was drawing to a close. 
Although his work had proceeded well, he was unsure of its interest or quality. 


25th April 1858: ...It is strange with what different feelings I regard at different times the 
results of my researches. Sometimes they appear to me of tolerable interest and value, at other 
times merely curious and common-place. Whatever they may be I hope soon to throw them aside 
to the indifferent public and occupy myself with others. 


6th June 1858: ...I have succeeded in integrating some partial differential equations that have 
caused me much trouble... I felt convinced that simple results ought to have been obtained and 
in fact I found after a while that a mistake where a’ was merely put in place of a had caused all 
the mischief. The thought of the three lost days was as nothing in comparison to the pleasure of 
seeing complication vanish and former results more than corroborated...


Ever since his Marburg days his health had caused him problems, which he 
frequently described in his diaries. In particular, toothache was a recurring 
problem... 


13th June 1858: I have undergone the very unpleasant operation of burning the nerve. It has 
changed the nature of the tooth-ache, but not cured it. One night John Martin put me a leech on 
my gum and it bled profusely for nearly 24 hours... 


Despite such problems, his work progressed well, and by the end of July he had 
finished his memoir on equally attracting surfaces for the Philosophical Magazine. 
In August, he left Paris to spend almost a year in Italy, an exciting time to visit, as
Italy was in the midst of Civil War.


her volunteers giving her three days to consider her reply. This news appears to be authentic. 
French troops are quickly moving towards the frontier, it is said they are in Genoa to-day. At any 
rate a fearful struggle has commenced and God knows how it will end. Its effects will be stamped
upon the Century for ever... 
Francesco Brioschi (1824-1897) Luigi Cremona (1830-1903)


23rd June 1859: ... he is beyond doubt the ablest mathematician of Italy. He is a rather tall
slightly built man with an intelligent earnest face, dark hair and beard and high good forehead,


Mathematicians of Europe... He deems Cayley about the 1st mathematician of Europe, 
Hermite the first in France and Kronecker perhaps in Germany. He differed slightly as to the
merits of Liouville and some others but agreed perfectly as to Bertrand, Chasles, Steiner &c.
In short, of all the mathematicians I have met in Italy he produced upon me the best impression. 


Cremona. 


30th June 1859: ... He is a young man, a pupil of Brioschi’s, married and has a family. He is 
short and has a bullet shaped bald head. Our conversation was first of all political and then
mathematical; it never flagged and we parted good friends. 
After two years abroad, Hirst decided that it was time to return home. After a
brief visit to see friends in Marburg and visit Anna’s grave in Paris, he set sail for 
England. The next few years in London were to be the most successful of his 
career, and form the topic of the next article. 
ON THE CHINESE ORIGIN OF THE
SYMBOL FOR ZERO. 


By PROFESSOR FLORIAN CAJORI. 


I have just received a letter from Mr. Y. 
Mikami, of Tokyo, Japan, containing informa- 
tion which (if confirmed by more extended 
research) is of great interest and importance. 
The letter is dated December 15, 1902. From it 
I quote the following: 

“I have found very important relations between
the mathematics of India and of China.
Arabian numerals seem to be of Chinese origin. 
The abacus, used by the Chinese from time 
immemorial, probably afforded the principle of 
position. In China the use of the symbol 0 for 
zero seems to have been very old. I desire to 
study the history of the Chinese mathematics 
from this point of view, if only I can secure 
sufficient materials, which is, however, very 
difficult. Chinese works are not [difficult] to 
understand for us Japanese, because we use the 
same letters.” 

Until recently the symbol for zero and the 
principle of local value in our notation of 
numbers were supposed to be of Hindu origin. 
A few years ago our attention was called to the 
early work of the Japanese, and now the 
priority appears to be passing to the Chinese. 


COLORADO COLLEGE, COLORADO 
SPRINGS, January 3, 1903. 
Thoughts on Innumeracy: Mathematics
Versus the World? 


Peter L. Renz 


(A reply by John Allen Paulos follows.) 


To some, mathematical calculations are soothing and reassuring. The ability to 
calculate gives them a sense of power. Speaking of an instance in school when his 
calculation was right and his teacher was wrong, John Allen Paulos wrote: 


I remember thinking of mathematics as a kind of omnipotent protector. You could prove things 
to people and they would have to believe you whether they liked you or not. 
(Innumeracy, page 73) 


Yet his teacher did not believe Paulos’s calculation and he didn’t acknowledge 
that Paulos was correct even after seeing Paulos confirmed by figures in the 
Milwaukee Journal. 

Calculation has its limits in conquering disbelief, and it has others. As basis for 
practical decisions or for science, calculation is limited by the accuracy of the data 
and the correctness of the assumptions on which it is based. Lord Kelvin calculated 
the age for the Earth based on the rate at which this planet cooled after its 
formation. He arrived at 20 million years, with 40 million years as a maximum. His 
calculations were correct; his assumptions were wrong. He did not know of the 
warming of the Earth’s interior by radioactive decay. The current best estimate for 
the age of the Earth (again a calculation, this one based on radioactive dating) is 
4.7 billion years—100 to 200 times the age that Kelvin estimated. 

The relentless and immutable nature of calculation, and of mathematics in 
general, is an affront to some. Among the offended are the circle squarers, the 
angle trisectors, and the like. These people are John Allen Paulos’s innumerates.
Their weaknesses lead to diverse problems: 


One rarely discussed consequence of innumeracy is its link with belief in pseudoscience. 
(Innumeracy, page 4) 


In addition to astrology, innumerates are considerably more likely than others to believe in 
visitors from outer space. 
(Innumeracy, page 59) 


... healthy skepticism ...a state of mind generally incompatible with innumeracy. 
(Innumeracy, page 62) 


Paulos gives no quantitative evidence for these commonplace assertions. Paulos 
attributes innumeracy to character faults:


Some people personalize events excessively, resisting an external perspective, and since numbers 
and an impersonal view of the world are intimately related, this resistance contributes to an 
almost willful innumeracy. 

(Innumeracy, page 80) 


But numeracy helps lift us out of the mire of personal concerns. 


If you... see happy people holding hands, eating ice cream cones, etc., it’s easy to begin to think 
that other people are happier, more loving, and more productive than you are, and so to become 
unnecessarily despondent...It’s beneficial to wonder occasionally what percentage of people 
you encounter suffer from this or that disease or inadequacy. 

(Innumeracy, page 81) 


There is a hostile and patronizing tone here and an evident lack of sympathy for 
the innumerate (pity or scorn, yes; sympathy, no). These set my teeth on edge. 
There is an arrogance and disregard for the difficulties of others and the difficul- 
ties of applying mathematics to real problems that reflects poorly on our subject. 
Consistent with this, Innumeracy is flawed by a cavalier disregard for accuracy. Yet
despite these faults, this book is a best-seller. Why? 

The answer is that we already see innumeracy, however defined, as a general 
problem (probably in ourselves and certainly in others). Here is a book that 
confirms a common perception, suggests a ready cure, and does all this with 
amusing banter and fun number facts. Let me tempt you with this sample: 


... take a deep breath. Assume Shakespeare’s account is accurate and Julius Caesar gasped 
“You too, Brutus” before breathing his last. What are the chances that you just inhaled a
molecule which Caesar exhaled in his dying breath? The surprising answer is that, with 
probability better than 99 percent, you did just inhale such a molecule. 

(Innumeracy, page 24) 


Fascinating, and for those who don’t believe him Paulos gives the reader a quick 
calculation to prove his point. Did I believe it? No, and here is why. 

Paulos states that the number of molecules in the atmosphere is about $10^{44}$.
Where did this number come from? I had no idea, and Paulos gives no clues, but
by digging around in The Handbook of Physics and Chemistry I found figures for
the mass of the atmosphere and the molecular constitution of the atmosphere that
made his number a reasonable estimate. Next, Paulos states that a breath is 1/30th of
a liter and contains $2.2 \times 10^{22}$ molecules. As we shall see, this is wrong on two
counts. First, a gram molecular weight (mole) of any gas at standard temperature
and pressure fills 22.4 liters and contains $6 \times 10^{23}$ molecules. The number $6 \times 10^{23}$
is Avogadro’s number, the number of molecules in a mole of any compound. A
quick calculation shows that Paulos should have gotten


$$\frac{1}{30} \times \frac{1}{22.4} \times 6 \times 10^{23} = 8.9 \times 10^{20}$$


molecules per breath instead of $2.2 \times 10^{22}$. But let’s follow his calculation as he
made it, using his number of molecules per breath. Suppose all the molecules in 
Caesar’s last breath are uniformly mixed up in today’s atmosphere. (Is this 
reasonable?) To get a handle on this, let’s call the molecules from Caesar’s last 
breath “lucky” and all other molecules “unlucky.” The probability of a random 
molecule’s being lucky is just


$$\frac{\text{Number of lucky molecules}}{\text{Number of molecules in atmosphere}} = \frac{2.2 \times 10^{22}}{10^{44}} = 2.2 \times 10^{-22}.$$

The probability of a random molecule’s being unlucky is

$$\frac{\text{Number of unlucky molecules}}{\text{Number of molecules in atmosphere}} = \frac{10^{44} - 2.2 \times 10^{22}}{10^{44}} = 1 - \frac{2.2 \times 10^{22}}{10^{44}} = (1 - 2.2 \times 10^{-22}) = Q.$$


We call this number $Q$.

The probability of two random molecules being unlucky is effectively $Q \times Q$.
(The second draw is not independent of the first because this is sampling without
replacement. Calculation shows that the adjustment for dependence leaves the
first twenty or so significant figures unaffected and can be neglected. Paulos makes
no mention of this, although dependence can be important and the observation
that it can be neglected here is a nice exercise in approximation.) The probability
of your whole lungful of molecules (all $2.2 \times 10^{22}$ of them according to Paulos)
consisting only of unlucky molecules is then just


$$P = Q^{2.2 \times 10^{22}} = (1 - 2.2 \times 10^{-22})^{2.2 \times 10^{22}}.$$

Paulos tells his reader that this product is less than 0.01. True, but how would
an even moderately sophisticated reader calculate $(1 - 2.2 \times 10^{-22})^{2.2 \times 10^{22}}$? You
can’t use your pocket calculator because $1 - 2.2 \times 10^{-22}$ figured on a calculator is
1 and the exponent is out of range. Repeated multiplication is out of the question;
it would take too long. You must use natural logs or the definition of $e$. Either
approach uses calculus and yields


$$P = e^{-4.84} = 0.0079.$$


This is Paulos’s probability of a whole lungful of unlucky molecules. So his 
probability of at least one lucky molecule in a random lungful is 1 —- P=1 —- 
0.0079 = 0.992 or better than 99%. 
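
The step that defeats a pocket calculator can be sketched in a few lines of Python. This is only an illustration under the assumptions stated in the text (independent draws, $10^{44}$ molecules in the atmosphere, $2.2 \times 10^{22}$ per breath); the function name `prob_no_lucky` is made up for the example. It uses `math.log1p`, which evaluates $\ln(1 - q)$ without first rounding $1 - q$ to 1.

```python
import math

N_ATMOSPHERE = 1e44   # molecules in the atmosphere (figure used in the text)
N_BREATH = 2.2e22     # molecules per breath according to Paulos

def prob_no_lucky(n_breath, n_atmosphere=N_ATMOSPHERE):
    """Probability that a breath of n_breath molecules contains no 'lucky' one.

    Assumes independent draws, each lucky with probability n_breath / n_atmosphere.
    """
    q_lucky = n_breath / n_atmosphere
    # log1p(-q) keeps the tiny quantity q that the subtraction 1 - q would round away.
    return math.exp(n_breath * math.log1p(-q_lucky))

# Direct evaluation: 1 - 2.2e-22 rounds to exactly 1.0 in double precision.
naive = (1.0 - N_BREATH / N_ATMOSPHERE) ** N_BREATH
p = prob_no_lucky(N_BREATH)

print(f"naive floating point: {naive}")
print(f"via logarithms:       {p:.4f}")
print(f"at least one lucky:   {1 - p:.3f}")
```

The naive line reproduces the calculator's failure, while the logarithmic route recovers the value $e^{-4.84}$ quoted above.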

What are my complaints? First, no innumerate (and relatively few numerates) 
could fill in the steps. Second, Paulos’s numbers are wrong. If you use his 1/30th of a
liter per breath, the calculation gives the probability of a random breath’s not
containing a molecule of Caesar’s last breath as

\[
P' = \left(1 - \frac{8.9 \times 10^{20}}{10^{44}}\right)^{8.9 \times 10^{20}} = 0.992.
\]


So the probability of getting a lungful of unlucky molecules is 0.992. (By coinci- 
dence, this number matches one in Paulos’s calculation, but it gives the comple- 
mentary probability.) Continuing, the probability of getting at least one lucky 
molecule in a lungful is 1 — 0.992 = 0.008, or less than 1%—contrary to what 
Paulos writes. 

What does this tell us? First, you get wrong answers from bad numbers. Second, 
when simple operations like addition, subtraction, multiplication, and raising to a 
power are taken to extremes, special techniques must be used. Third, it is not easy 
to dig up good values for the numbers needed in many calculations.

This calculation is mentioned without details in J. E. Littlewood’s A Mathematician’s
Miscellany, and Littlewood credits James Jeans. I tracked this to Jeans’s An
Introduction to the Kinetic Theory of Gases, Cambridge University Press, 1942.
With a breath of 0.4 liter, or $10^{22}$ molecules, this is also the number of such breaths
in the atmosphere, which Jeans puts at $10^{44}$ molecules. With proper mixing, each
breath could contain a molecule of Caesar’s last breath. No fanfare. Jeans’s 
numbers are good and his calculation is immediate. Paulos’s calculation is tricky 
and his volume for a breath too small. The volume is close to 1/2 liter (more for a 
deep breath). Paulos did not check the volume of a breath either by experiment or 
in references. I looked at Human Respiration by Olof Lippold, W. H. Freeman and 
Company, San Francisco, 1968 and I experimented as well. 

The hypothesis of random mixing of the molecules of Caesar’s last breath in the 
atmosphere is dubious. There is no evidence that Paulos checked this. There are 
several problems concerning this mixing. Molecules of air dissociate and can 
recombine forming other molecules or react to become part of the biosphere, 
hydrosphere, or even end up in sediment. Looking into this requires a bit of 
research. Nitrogen is the main constituent of air (80% of it). The amount of 
nitrogen in sediment is more than that in the atmosphere. However, interchange 
between atmosphere and sediment is quite slow. One must check on this. My 
source was Delwiche’s article “The Nitrogen Cycle,” Scientific American, September
1970. These numbers are rough, but they suggest that it is safe to assume
almost all the nitrogen molecules in Caesar’s last breath are still in the atmosphere,
but this does not speak to the uniform mixing of those molecules in the
atmosphere.

We can work out Paulos’s calculation with an average breath of 1/2 liter, 
assuming total random mixing of the original molecules of Caesar’s last breath in 
the atmosphere, and that there is no loss of those molecules. The probability of a 
random breath’s not containing any “lucky” molecules is 


\[
\left(1 - \frac{1.3 \times 10^{22}}{10^{44}}\right)^{1.3 \times 10^{22}} = e^{-1.3 \times 1.3} = 0.16.
\]

So the probability of getting at least one lucky molecule in an average lungful is 
1 — 0.16 = 0.84. 


By increasing the estimated size of a breath of air you can pump this probability 
up to Paulos’s 99%. 
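
The dependence on breath size can be tabulated with a short Python sketch. It assumes the same independent-draw model and the molecule counts quoted in the text; the helper name and the conversion of $1.3 \times 10^{22}$ molecules per half liter into liters are illustrative assumptions. Note that with the rounded count $1.3 \times 10^{22}$ the sketch returns about 0.18 for the half-liter breath rather than the printed 0.16, which would correspond to a slightly larger count of about $1.34 \times 10^{22}$.

```python
import math

N_ATMOSPHERE = 1e44  # molecules in the atmosphere (figure used in the text)

def prob_at_least_one_lucky(n_breath, n_atmosphere=N_ATMOSPHERE):
    """1 - (1 - n/N)^n, evaluated stably with log1p and expm1."""
    q = n_breath / n_atmosphere
    return -math.expm1(n_breath * math.log1p(-q))

# Molecule counts quoted in the text for the three breath sizes.
breaths = {
    "1/30 liter": 8.9e20,
    "1/2 liter": 1.3e22,
    "Paulos": 2.2e22,
}
results = {label: prob_at_least_one_lucky(n) for label, n in breaths.items()}
for label, prob in results.items():
    print(f"{label:>10}: {prob:.3f}")

# Breath size needed for a 99% chance: solve exp(-n^2 / N) = 0.01 for n.
n_99 = math.sqrt(N_ATMOSPHERE * math.log(100.0))
molecules_per_liter = 1.3e22 / 0.5   # from the half-liter figure above
liters_99 = n_99 / molecules_per_liter
print(f"breath giving 99%: {n_99:.2e} molecules, about {liters_99:.2f} liter")
```

Under these assumptions a breath of roughly eight tenths of a liter is what it takes to reach the 99% figure.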

There is a final question here: What is the purpose of such a calculation? Is the
object simply to amaze the reader, or is it to instruct, or is it intended to lead to
some course of action? What do we learn from Paulos here? Jeans and Littlewood, 
speaking to those who could work out the technicalities, had clear points in mind, 
but Paulos’s purpose is unclear. 

Reviewing Innumeracy in The Washington Post, Eleanor Wilson Orr, a mathe- 
matics and science teacher for 35 years and an author writing on issues in 
mathematics education, said, “...for the innumerate who wants to take this book
seriously and read it carefully, the book is intimidating. ... I learned a lot from 
this book but I spent five full days reading it with a pencil in my hand. I fiddled 
with the numbers, I drew diagrams, I daydreamed and tried to explain to myself 
what Paulos doesn’t explain. I trusted that I would understand it if I kept at it. 
Innumerates either quit or think it’s enough to get the general idea, and so remain 
innumerate.” She noted none of Paulos’s errors. Judith Axler Turner, who wrote 
an article on Paulos and his book for the Chronicle of Higher Education, commented
that Paulos scoffed at Orr’s difficulties, but I do not scoff. The last of Orr’s
sentences quoted is on the mark. You cannot read Paulos’s book seriously without
giving some attention to the details and that attention will require serious work. 
Not only will it require serious work but that work will reveal that there is less in 
Paulos’s book than meets the eye of the casual reader. 

Here is another numerical problem Paulos poses and answers. It is equally 
amusing but seems more practical. 


One last earthly calculation that a scientific consultant from M.I.T. uses to weed out prospective 
employees during job interviews: How long, he asks, would it take dump trucks to cart away an 
isolated mountain, say Japan’s Mount Fuji, to ground level? Assume trucks come every fifteen 
minutes, twenty-four hours a day, are instantaneously filled with mountain dirt and rock, and 
leave without getting in each other’s way. The answer is a little surprising and will be given later. 

(Innumeracy, page 12) 


The answer, without explanation, appears in a sentence on page 15 where
Paulos estimates it would take 5,000 to 10,000 years to truck away Mount Fuji.
This is a surprisingly short time for such a job. It is also wrong. The only fact that
Paulos mentions about Mount Fuji is its height, 12,000 feet, so it is clear that he
figured the mountain was some sort of cone. The volume of a cone is a third of
its base area multiplied by its height, a fact easily derived and known to
Archimedes. Evidently, Paulos must have also used the area of the base of
Mount Fuji in his calculation. Did he look this up? Did he consult maps? No,
as an exchange of letters revealed, he dreamed it up. He assumed Mount Fuji
was a cone as wide at its base as it was high. Volcanos are simply not shaped
this way, and one might expect Paulos, who spent a year at the University of
Washington within easy view of Mount Rainier, to know something about the
shape of a volcano. Leaving that aside, you might expect him, as an author, to
look at an atlas. I did. The map
is revealing. It shows that Fuji is roughly conical and has a radius of about 12
kilometers at its 1000 meter contour. Its height is 3776 meters above sea level.
Below 1000 meters it broadens out considerably. We might construe the problem
of trucking away Mount Fuji as that of taking enough of it away so that what was
left would blend into the countryside. From the map it looks as if taking the top
2776 meters off the peak would do the job. The volume of that part of the
mountain is

\[
\frac{1}{3} \times \pi (12{,}000\ \text{m})^2 \times 2776\ \text{m} = 4.19 \times 10^{11}\ \text{m}^3.
\]

The gentle slopes of Mount Fuji are shown here. Photograph courtesy of the Japan National Tourist Organization.


Calling a local importer of heavy Japanese trucks, I found that the largest 
standard model that they imported could carry 18.5 cubic yards. Round up to 20 
cubic meters per load, and divide by the product—cubic meters per load times 
loads per hour times hours per day, etc.—and you will find that it would take 
about 600,000 years to truck away the top 2776 meters of Mount Fuji. To cart away
a cone the same shape and the height of Mount Fuji measured from sea level 
(3776 meters) would take over 1.5 million years at this rate. Paulos’s estimate of 
5,000 to 10,000 years is off by orders of magnitude. Would his mythical M.I.T.-based 
recruiter have hired him for some practical job? I would hope not, but given the 
errors committed in real-world engineering, perhaps so. Note that even with these 
considerations, this is a highly idealized problem. It is clear that no such project 
could ever be carried out. 
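
The trucking arithmetic can be retraced with a small Python sketch. It takes the figures given above (a cone of radius 12 km and height 2776 m, 20 cubic meters per load, one truck every fifteen minutes around the clock) and assumes a 365-day year; the variable names are invented for the illustration.

```python
import math

def cone_volume(radius_m, height_m):
    """Volume of a right circular cone: one third of base area times height."""
    return math.pi * radius_m**2 * height_m / 3.0

LOAD_M3 = 20.0                  # cubic meters per truckload (rounded up, as in the text)
LOADS_PER_YEAR = 4 * 24 * 365   # one truck every fifteen minutes, 24 hours a day

top_volume = cone_volume(12_000.0, 2776.0)
years_top = top_volume / (LOAD_M3 * LOADS_PER_YEAR)

# A cone of the same shape but 3776 m high: volume scales with the cube of the height.
years_full = years_top * (3776.0 / 2776.0) ** 3

print(f"volume of top 2776 m: {top_volume:.3g} cubic meters")
print(f"years for top 2776 m: {years_top:,.0f}")
print(f"years for full cone:  {years_full:,.0f}")
```

Both results sit two orders of magnitude above the 5,000 to 10,000 years under discussion.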

Here is an example of real-life erroneous calculation with a potentially large 
impact. These calculations were made by William R. Sears and Irving L. Ashkenas 
in a secret assessment of promising aeronautical technologies that they prepared in 
1945. Sears and Ashkenas built a mathematical model to show how the range of an 
aircraft varied as one redistributed the volume between the wing and the fuselage. 
Sears and Ashkenas were working at the time for the Northrop Corporation, a 
firm then building various experimental “flying wing” aircraft. They differentiated 
their formula for range as a function of the percentage of volume in the wings and 
found only two possible extrema: one of these was when all the volume was in the 
wing and the other when a much smaller fraction of the volume was in the wing. 
Sears and Ashkenas wrote, “It can be ascertained that the former [all volume in the
wing] gives maximum range, while the latter gives a minimum.” Hence, flying wings
have the maximal range.


The Northrop YB-49 Flying Wing, left, and its sleek delta-form descendant, the B-2 Stealth bomber,
right. Both pictures courtesy of the Northrop Corporation.


Joseph Foa, who headed a group studying possible designs for an unmanned jet 
aircraft at Cornell Aeronautical Laboratory (CAL), had reached the contrary
conclusion, that a flying wing configuration would not give maximum range. After
Sears came to head CAL, Foa had a chance to examine the Sears-Ashkenas report
and discovered that the critical point associated with the flying wing configuration
was a minimum, not a maximum. It gave the minimum range according to their
model, not the maximum range, as Sears and Ashkenas claimed. 

This came to light in Science (Volume 244, pp. 650-651, 12 May 1989) in 
connection with controversy about the B-2 stealth bomber, also a flying wing. In 
the 1940s Foa kept his silence on the condition that Sears and Ashkenas publish 
some sort of correction to their earlier analysis. That correction took the form of 
the 1948 paper “Range performance of turbojet airplanes” by Ashkenas in the
Journal of the Aeronautical Sciences. Here Ashkenas made a much more complex 
mathematical model, which had the property that the flying wing configuration 
gave optimal range for certain choices of the basic parameters. To this day Foa 
remains unconvinced, asserting that the Ashkenas optimum flying wing would be 
impractically thick. 

The B-2 project had a multi-billion dollar budget. The initial error of Sears and 
Ashkenas, mistaking a minimum for a maximum, is a classic for students in 
freshman calculus. But even after careful consideration by competent aeronautical 
engineers, it is not clear whether the flying wing is the best or the worst way to go 
if one wants a long-range plane. The answer you get from a mathematical model 
seems to depend on what answer you want to get. 

Our quantitative understanding of the world is not simply based on assump- 
tions; it is based on observation. Generally there is a lot of hard work needed to 
get good numbers. When it comes to projecting cancers that may or may not be 
caused by the breakdown products of minute quantities of Alar in apples, the work 
is hard, the numbers are soft, and the theoretical apparatus is quite involved. The 
meaning of such calculations is more controversial, important, and uncertain than 
for the calculations I have discussed above. Paulos gives scant attention to any of 
this. 

Paulos is quick to point out the problems of others. 


A recent study by Drs. Kronlund and Phillips of the University of Washington showed that 

doctors’ assessments of the risks of various operations, procedures, and medications (even in 

their own specialities) were way off the mark, often by several orders of magnitude. 
(Innumeracy, page 8.) 


Paulos continues in the cited paragraph, heaping scorn on doctors, “I once had a
conversation with a doctor who, within approximately twenty minutes, stated that a
certain procedure he was contemplating (a) had a one-chance-in-a-million risk
associated with it; (b) was 99% safe; and (c) usually went quite well. Given the fact
that so many doctors seem to believe that there must be at least eleven people in
the waiting room if they’re to avoid being idle, I am not surprised at this new
evidence of their innumeracy.” Let’s think this through. If (a) is true, then (b)
follows, because it is a weaker condition. Furthermore, it is reasonable that (c) might
also be true. A doctor might say that such a procedure “usually went well,”
although this is a qualitative judgment having to do with ease of the procedure and
lack of difficulties for the doctor. Now, I expect the doctor in question did not have
much detailed statistical information, because such information is difficult and
costly to gather. But does what Paulos presents show the doctor to be innumerate?
Not at all, and the line about waiting rooms is a cheap shot meant to please 
readers who have cooled their heels in a doctor’s office. 

The importance and the difficulty of gathering good data on medical matters 
are greater than one might think. Let me use an example from Innumeracy. Paulos
begins a calculation on page 21 by stating that the probability of heterosexual
transmission of AIDS from an infected to an uninfected person is 0.002 per act of
intercourse. This probability is called the infectivity. He says this is an average of 
figures from several studies—but he cites no sources. This makes me wonder, 
because it is difficult to see how one could get good figures on the transmissibility 
of this disease. Only a Dr. Mengele operating without restraint could plan and 
execute experiments on AIDS infectivity. The problems include the facts that 
AIDS is sexually transmitted and 100% fatal. It is extremely difficult to get 
accurate information about sexual behavior. These matters are private and sensi- 
tive. People regularly lie about sexual matters, and Congress regularly kills pub- 
licly-funded studies of sexual behavior. 

I asked experts, including Eric Lander who organized the National Academy of 
Sciences session on AIDS, about reliable information on the heterosexual trans- 
missibility of AIDS and came up with little. The best source I found was the issue 
of Los Alamos Science, Number 18, 1989, devoted to AIDS. The lead article, 
“AIDS and a Risk-Based Model,” by Colgate, Stanley, Hyman, Qualls, and Layne, 
gives estimates for the infectivity ranging from 0.0014 to 0.004. These authors cite
others whose estimates of the infectivity run from 0.003 to 0.1. The methods 
discussed are difficult and use epidemiological data and complex assumptions. 
There is a factor of about 100 between the largest and smallest of these estimates 
of infectivity, and Paulos’s 0.002 falls within the range, on the low side. It would be 
prudent not to put too much faith in any particular number for the infectivity. 

Let’s see what use Paulos makes of this probability. He says we may assume 
these probabilities of transmission to be independent. He then notes that
$(1 - 0.002)^{346} = 0.5$, so that a year’s worth of nightly unprotected intercourse with an
infected partner leaves you with a better than 50% chance of being uninfected. Next, he
asserts that if a condom is used the infectivity drops to $2 \times 10^{-4}$. Now you can
enjoy ten years of nightly intercourse with the victim (assuming the victim lives this
long, Paulos adds parenthetically) before your probability of getting AIDS rises to
0.5. Finally, Paulos states that the probability of contracting AIDS from a single
act of unprotected intercourse with someone belonging to no known risk group is
$2 \times 10^{-7}$ and with a condom this probability drops to $2 \times 10^{-8}$. He writes that you
will more likely die in a car accident returning from such a tryst than catch AIDS 
during the act. All this suggests that one need not worry all that much about 
AIDS. Now these calculations are correct, though the assumptions underlying 
them are dubious, and the suggestion that AIDS is not very worrisome is dead 
wrong. 
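
The arithmetic behind the 346 acts and the ten years can be checked with a brief Python sketch. It assumes, as Paulos does, that each act is an independent trial with a fixed infectivity; the function name `acts_until_risk` is hypothetical.

```python
import math

def acts_until_risk(p_per_act, target=0.5):
    """Number of independent exposures after which P(infected) reaches target.

    Solves (1 - p_per_act)^n = 1 - target for n.
    """
    return math.log(1.0 - target) / math.log1p(-p_per_act)

unprotected = acts_until_risk(0.002)    # infectivity 0.002 per act
with_condom = acts_until_risk(0.0002)   # infectivity 2e-4 per act

print(f"unprotected: {unprotected:.0f} acts")
print(f"with condom: {with_condom:.0f} acts, about {with_condom / 365:.1f} years of nightly acts")
```

The sketch says nothing about whether a single fixed infectivity is a sensible model, which is the point taken up next.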

AIDS transmission is extremely variable. It appears that an individual can be so 
infectious as to infect virtually everyone with whom he has unprotected sexual 
intercourse. The evidence for this comes from an Australian sperm donor. His 
frozen sperm sample was split into ten doses, eight of which were used, resulting in 
four infected women. A Poisson model is suggested for assaying infectivity by 
dilution methods in “The Kinetics of HIV Infectivity,” by Layne, Dembo, and
Spouge in the cited issue of Los Alamos Science. Let N be the number of
individuals treated with the diluted infectious agent (here, N = 8). Let d be the
dilution that infects 50% of the individuals treated (here d = 0.1), and I be the
infectivity of the undiluted semen. Then in this instance

\[
\text{Number infected} = 4 = 0.5N = N e^{-dI} = N e^{-0.1 I}.
\]

The exponential factor comes from the Poisson probability,

\[
p_k = e^{-dI} (dI)^k / k!
\]

with k = 0. Using this we can estimate the probability of infection from a single
act of intercourse with this donor at the time of donation

\[
1 - \text{Probability of no infection} = 1 - e^{-I} = 1 - \frac{1}{2^{10}} = 0.999
\]

where the probability of no infection is simply obtained from the Poisson $p_0$ with
d = 1 and I evaluated from $0.5 = e^{-0.1 I}$.
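
A minimal Python sketch of this estimate follows, assuming the Poisson model just described with the probability of no infection at dilution d equal to $e^{-dI}$. The function names are illustrative and are not taken from the Los Alamos Science article.

```python
import math

def infectivity_from_half_dilution(d_half):
    """Solve 0.5 = exp(-d_half * I) for the infectivity I."""
    return math.log(2.0) / d_half

def prob_infection(infectivity, dilution=1.0):
    """1 - p_0, where p_0 = exp(-dilution * infectivity) is the Poisson k = 0 term."""
    return -math.expm1(-dilution * infectivity)

infectivity = infectivity_from_half_dilution(0.1)    # d = 0.1 infected half of N = 8
p_undiluted = prob_infection(infectivity)            # d = 1: a single act
p_diluted = prob_infection(infectivity, dilution=0.1)

print(f"estimated infectivity I: {infectivity:.2f}")
print(f"P(infection), undiluted: {p_undiluted:.4f}")
print(f"expected infections among 8 at d = 0.1: {8 * p_diluted:.1f}")
```

With only eight treated individuals the estimate of I is of course very uncertain; the sketch reproduces the point estimate only.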

Not much data is available. But to illustrate the variability of transmission of 
AIDS, I quote another example. Sperm from an infected New York donor was 
used to inseminate 90 women, none of whom contracted AIDS. Even given 
selectivity in reporting, it is unlikely that the cited Australian and New York 
examples are samples drawn from the same population. 

What of Paulos’s comment about the risk of being killed in an automobile 
accident versus the risk of getting AIDS? It is a common and generally uncalcu- 
lated guess that the risks of doing X are less than the risk of driving to the place 
where you do X. From Paulos’s figures, it might be the other way around in this 
case. The average US passenger death rate as given in The World Almanac and 
Book of Facts: 1992 is $1.12 \times 10^{-8}$ deaths per mile. Compare to Paulos’s estimate
of a risk of $2 \times 10^{-8}$ for contracting AIDS from protected intercourse with
someone having no known risk factors. The risk that dominates will depend, at 
least, on how far away the tryst is. More importantly, the death rate per mile 
depends strongly on the age, sex, and driving history of the driver and on such 
things as sobriety, roads and road conditions, and on whether a seat belt is used. 
Your risk per mile could be quite a bit larger or smaller than the average which I 
gave. As we begin to bring in these considerations, we move from a general 
statistical treatment toward special cases and special pleading. More data would be 
needed to establish the risks for these new classes. This leads away from easy 
calculation and toward more specialized cases. This sort of thinking is a poor guide 
for public policy, but we live or die as special cases, not on the average. 
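
For the average figures alone, the distance at which the two risks balance is a one-line computation. The sketch below takes the two rates as read from the text ($1.12 \times 10^{-8}$ deaths per mile and a risk of $2 \times 10^{-8}$ per protected act) and assumes the driving risk grows linearly with distance; it ignores every one of the individual factors just listed.

```python
DEATHS_PER_MILE = 1.12e-8   # average US passenger death rate per mile, as quoted
RISK_PROTECTED = 2e-8       # Paulos's risk per protected act, no known risk group

# Miles of driving whose fatality risk equals the risk of the act itself.
break_even_miles = RISK_PROTECTED / DEATHS_PER_MILE

def driving_risk(miles, rate=DEATHS_PER_MILE):
    """Fatality risk of a trip, assuming risk proportional to distance."""
    return miles * rate

print(f"break-even distance: {break_even_miles:.2f} miles")
print(f"risk of a 10-mile drive: {driving_risk(10):.2e}")
```

On these averages the drive dominates once the total distance exceeds about two miles.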

This simply hints at what is wrong with the lax, breezy treatment Paulos gives. 
When applied to so serious a matter as AIDS, it is shocking. Yes, innumeracy is a 
problem, but Innumeracy is more a part of the problem than of the solution. This
need not have been the case. Were the book less negative toward the innumerate 
and more carefully done, it could have made a wonderful contribution. My copy 
has quite a few favorable comments in the margins along with many notes on 
errors of the sort I mentioned here. 

We should hold ourselves, our students, and others to higher standards. We 
want public appeal, clarity, and truth. This will not be easy to get, but why settle 
for less? 


101 Colchester Street 
Brookline, MA 02146 


John Allen Paulos replies: 


A distant relative of mine recently returned from a three-month stay in Florida. 
When I asked him how it was, he launched into a detailed disquisition on the 
mechanical minutiae of his new appliance for extracting juice from oranges and 
grapefruits. He took offense at my attempt to summarize his comments or to use


Reflections on Rippling Water


Michel Mendes France 


1. INTRODUCTION. On a summer evening standing beside a large lake that 
extends to the horizon, we observe the moon’s reflection on the rippled surface of 
the water. When the moon is low, but nonetheless completely above the horizon, 
the reflection may still appear as a long uninterrupted yellow column which 
stretches from some point on the lake to the horizon. Its length can be considered 
as infinite. Later, when the moon rises higher up in the sky the reflection changes 
aspect and becomes a shorter beam, in the shape of a narrow oval. It is closer to us 
and no longer extends to the horizon. Its length is now finite. 

Stars may appear in this evening sky. If a gentle breeze is blowing, each one of 
these stars will appear to be reflected an odd number of times in a given direction 
on the surface of the lake. 

These evocative images raise interesting mathematical questions. At what angle 
does the moon’s reflection change from an infinite image to a finite one? Is it 
possible to see exactly two reflections of the same star? The object of this paper is 
to answer these questions. Our analysis only requires simple trigonometry. 


2. THE THEORY. Suppose an observer at height $H_1$ sees a reflected object on
the wavelet M at a distance x across the water. (In what follows $H_2$ is the height
of the object and $l$ its horizontal distance from the observer.)

Let $\alpha = \alpha(x)$ be the angle, measured in radians, between the normal MN to the
wave and the vertical V. Let $\alpha_0$ be the maximal value of $|\alpha(x)|$ and define $\varphi(x)$ by
$\alpha(x) = \alpha_0 \varphi(x)$ so that $|\varphi(x)| \le 1$. Let i be the angle of reflection (Figures 1 and 2).


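Figure 1

Figure 2


Trivial trigonometry shows that

\[
\tan(\alpha + i) = \frac{x}{H_1}, \qquad \tan(i - \alpha) = \frac{l - x}{H_2}.
\]

Hence

\[
\tan 2\alpha = \tan[(\alpha + i) - (i - \alpha)]
= \frac{\tan(\alpha + i) - \tan(i - \alpha)}{1 + \tan(\alpha + i)\tan(i - \alpha)}
= \frac{\dfrac{x}{H_1} + \dfrac{x - l}{H_2}}{1 - \dfrac{x(x - l)}{H_1 H_2}}
= \frac{(H_1 + H_2)x - lH_1}{H_1 H_2 + lx - x^2}.
\]

We now assume that $\alpha_0$ is small. Then

\[
2\alpha \approx \frac{(H_1 + H_2)x - lH_1}{H_1 H_2 + lx - x^2}.
\]

Finally,

\[
\varphi(x) \approx \frac{(H_1 + H_2)x - lH_1}{2\alpha_0 (H_1 H_2 + lx - x^2)}. \tag{1}
\]
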
Before exploiting the relationship given by (1), let us analyze the corresponding
equation

\[
\varphi(x) = \frac{x(H_1 + H_2) - lH_1}{2\alpha_0 (H_1 H_2 + lx - x^2)}. \tag{2}
\]

Note that each side of (2) has a physical interpretation. The function $\varphi$ describes
the shape of the waves while the right-hand side represents the distance at which a
reflection occurs. So, for a given shape $\varphi$, the solutions of (2) are the approximate
distances at which a reflection occurs. In particular, the number of solutions is the
number of reflected images we see.

Now let us analyze the equation. We start by looking at the simplest case when
there are no waves at all: $\varphi = 0$. Then

\[
x = \frac{lH_1}{H_1 + H_2}.
\]

Thus there is only one reflection so that the observer sees a perfect image of the
object. If, in particular, $H_1 = H_2$, then $x = l/2$ and the light ray is reflected at the
midpoint between the observer and the object. This is of course well known.

Let us now discuss the general case where $\alpha$ stays small. We solve the equation
(2) graphically.

Let B be the curve

\[
B : x \mapsto \frac{x(H_1 + H_2) - lH_1}{2\alpha_0 (H_1 H_2 + lx - x^2)}.
\]

In the interval (0, l), B is continuous and increasing. Furthermore

\[
B(0) = -\frac{l}{2\alpha_0 H_2} \quad \text{and} \quad B(l) = \frac{l}{2\alpha_0 H_1}.
\]

We assume that both $H_1$ and $H_2$ are strictly less than $l/2\alpha_0$ (l is large and $\alpha_0$ is
small), so that $B(0) < -1$ and $B(l) > 1$.

The curve $x \mapsto \varphi(x)$ oscillates in the horizontal strip between $y = -1$ and $y = +1$.
Supposing $\varphi$ is continuous in the interval [0, l], both curves intersect either at an odd
number of points or infinitely often. Thus, whatever the shape of the waves may
be, one should see either an odd number of reflections, or infinitely many. (This
last case may indeed occur if, for example, $\varphi$ has a singularity of the type
$(x - a)\sin(x - a)^{-1}$ in the neighbourhood of some $a \in (0, l)$.)
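
The parity claim can be observed numerically. The following Python sketch counts sign changes of $\varphi(x) - B(x)$ on a fine grid, which is only a rough stand-in for the graphical solution; the heights, distance, and maximal slope are illustrative values chosen to satisfy $H_1, H_2 < l/2\alpha_0$, and the wave shape $\sin x$ is an arbitrary choice, none of them taken from the text.

```python
import math

def curve_b(x, h1, h2, l, alpha0):
    """Right-hand side of equation (2)."""
    return ((h1 + h2) * x - l * h1) / (2.0 * alpha0 * (h1 * h2 + l * x - x * x))

def count_reflections(phi, h1, h2, l, alpha0, samples=200_000):
    """Count sign changes of phi(x) - B(x) on a uniform grid over [0, l]."""
    count = 0
    previous = phi(0.0) - curve_b(0.0, h1, h2, l, alpha0) > 0.0
    for k in range(1, samples + 1):
        x = l * k / samples
        current = phi(x) - curve_b(x, h1, h2, l, alpha0) > 0.0
        if current != previous:
            count += 1
        previous = current
    return count

# Illustrative values only: observer 2 high, object 5 high and 1000 away.
H1, H2, DISTANCE, ALPHA0 = 2.0, 5.0, 1000.0, 0.05

flat = count_reflections(lambda x: 0.0, H1, H2, DISTANCE, ALPHA0)
rippled = count_reflections(math.sin, H1, H2, DISTANCE, ALPHA0)

print(f"flat water:    {flat} reflection")
print(f"rippled water: {rippled} reflections")
```

Because the difference is positive at x = 0 and negative at x = l, the number of sign changes found this way is odd whatever the grid resolution, which mirrors the intermediate-value argument above.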


Figure 3


3. LIMITING CASES. Let us study the solutions of equation (2) when $l = +\infty$
(reflection of the moon or the sun...).

We denote by $\omega$ the angle at which the infinitely far away object is seen. Then
$H_1/l$ is negligible and $H_2/l = \tan\omega$. Thus rewriting the right hand side of
equation (1) in the form

\[
\frac{x(H_1 l^{-1} + H_2 l^{-1}) - H_1}{2\alpha_0 \left(x - \dfrac{x^2}{l} + \dfrac{H_1 H_2}{l}\right)}
\]

we have

\[
\varphi(x) = \frac{x \tan\omega - H_1}{2\alpha_0 (x + H_1 \tan\omega)} = \gamma(x).
\]

As before we solve the equation $\varphi(x) = \gamma(x)$ graphically and we suppose that $\varphi$
oscillates a great many times, say

\[
\varphi(x) = \sin \lambda x
\]

where $\lambda$ is large. We solve equation (3) for $x \in (0, l)$:

\[
\sin \lambda x = \gamma(x). \tag{3}
\]

Since $\gamma(x)$ is monotonically increasing we know that the smallest solution $x_s$ of (3)
occurs when this function is $-1$ and the largest solution $x_L$ occurs when the
function is $+1$.

When $\tan\omega < 2\alpha_0$ we see from Figure 4 that the two curves intersect infinitely
many times and the smallest solution is approximately

\[
x_s = \max\left\{0,\; H_1 \frac{1 - 2\alpha_0 \tan\omega}{\tan\omega + 2\alpha_0}\right\}.
\]

In this case, the reflection on the water extends from $x_s$ to the horizon. When
$\tan\omega > 2\alpha_0$ the solution is shown on Figure 5.


Figure 4

Figure 5


In this case there is only a finite odd number of reflections and the reflections
lie between $x_s$ and $x_L$, where

\[
x_L \approx \frac{H_1}{\tan\omega - 2\alpha_0} (1 + 2\alpha_0 \tan\omega).
\]

It follows that the critical angle $\omega_c$ at which the reflection ceases to be infinite is

\[
\omega_c = \tan^{-1}(2\alpha_0).
\]

As $\alpha_0$ is assumed to be small, we have

\[
\omega_c \approx 2\alpha_0.
\]

Finally, if one is given the shape of the waves as a Cartesian equation

\[
y = \psi(x),
\]

then

\[
\alpha(x) = \tan^{-1} \psi'(x),
\]

provided $\psi$ is differentiable. As $|\alpha(x)|$ is small, this entails $\alpha(x) \approx \psi'(x)$ so that
the critical angle is

\[
\omega_c = 2 \max_x |\psi'(x)|.
\]
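
The limiting formulas are easy to evaluate. The Python sketch below assumes the expression for $\gamma(x)$ derived in this section and solves $\gamma(x) = -1$ and $\gamma(x) = +1$ in closed form; the observer height, the maximal slope angle, and the two elevation angles are made-up values for illustration, and the names `x_s` and `x_l` simply label the near and far ends of the reflection.

```python
import math

def gamma(x, h1, omega, alpha0):
    """gamma(x) = (x tan(omega) - H1) / (2 alpha0 (x + H1 tan(omega)))."""
    t = math.tan(omega)
    return (x * t - h1) / (2.0 * alpha0 * (x + h1 * t))

def critical_angle(alpha0):
    """Elevation at which the reflection stops reaching the horizon."""
    return math.atan(2.0 * alpha0)

def reflection_span(h1, omega, alpha0):
    """Return (x_s, x_l); x_l is infinite when tan(omega) <= 2 * alpha0."""
    t = math.tan(omega)
    x_s = max(0.0, h1 * (1.0 - 2.0 * alpha0 * t) / (t + 2.0 * alpha0))
    if t <= 2.0 * alpha0:
        return x_s, math.inf
    x_l = h1 * (1.0 + 2.0 * alpha0 * t) / (t - 2.0 * alpha0)
    return x_s, x_l

H1, ALPHA0 = 2.0, 0.05   # hypothetical observer height and maximal wave slope angle

omega_c = critical_angle(ALPHA0)
low_moon = reflection_span(H1, math.radians(3.0), ALPHA0)
high_moon = reflection_span(H1, math.radians(20.0), ALPHA0)

print(f"critical angle: {math.degrees(omega_c):.2f} degrees")
print(f"moon at 3 degrees:  reflection from {low_moon[0]:.2f} to {low_moon[1]}")
print(f"moon at 20 degrees: reflection from {high_moon[0]:.2f} to {high_moon[1]:.2f}")
```

Below the critical angle the far end is reported as infinite, corresponding to the column of light that reaches the horizon.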


4. AN APPLICATION. The amplitudes of real waves on the ocean, far away from
the coast (say 100 yards or more) are difficult to measure: they move rapidly and
their size may be small, especially if we are discussing wavelets or even ripples. On
the other hand, $\omega_c$ can be quite easily measured at sunset: observe at what angle
$\omega_c$ the reflection starts to touch the horizon. If we assume that during that time of
waiting, the waves keep approximately the same shape, say

\[
\psi(x) = A \sin \lambda x \cos \lambda c t
\]

where c is the velocity of the wave and t is time, then the knowledge of $\omega_c$ and of
the frequency $\lambda$ gives us the amplitude A of the waves

\[
A = \frac{\omega_c}{2\lambda}.
\]

Our analysis is also valid for studying the microscopic structure of a glossy surface.
The macroscopic observation of a reflecting luminous point provides information
on the fine structure of the surface. Determining $\omega_c$ measures the product $A\lambda$.
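
As a worked instance of the amplitude formula, the Python sketch below converts a measured critical angle into a wave amplitude. It assumes the sinusoidal shape above, so that the spatial frequency is $2\pi$ divided by the wavelength; the critical angle of 10 degrees and the wavelength of half a meter are invented inputs.

```python
import math

def wave_amplitude(omega_c, wavelength):
    """Amplitude A = omega_c / (2 * lam), with lam = 2 * pi / wavelength."""
    lam = 2.0 * math.pi / wavelength
    return omega_c / (2.0 * lam)

omega_c = math.radians(10.0)   # hypothetical measured critical angle
wavelength = 0.5               # hypothetical ripple wavelength in meters

amplitude = wave_amplitude(omega_c, wavelength)
max_slope = amplitude * 2.0 * math.pi / wavelength   # the product A * lam

print(f"amplitude: {amplitude * 1000:.1f} mm")
print(f"maximal slope A * lam: {max_slope:.4f} (half the critical angle in radians)")
```

The second printed quantity is the product that the critical angle determines directly; the amplitude itself requires the wavelength as a separate measurement.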

It was only after completing this work that I discovered M. Minnaert’s delightful 
book [1] on the “Nature of Light & Colour in the open air.” It discusses related 
topics and I highly recommend it (see in particular pp. 23-26). I wish to thank the 
referee and Jacques Harthong for helping me to improve the exposition and the 
graphs. 


Addendum. Many authors have studied the reflection on rippling water. I would 
like to single out M. V. Berry’s beautiful article “Disruption of images: the 
caustic-touching theorem,” J. Opt. Soc. Am. A. 4, 1987, pp. 561-569.


PICTURE PUZZLE
(from the collection of Paul Halmos)


Are they related?
(see page 809.)


The Principal Axis Theorem over Arbitrary Fields


David Mornhinweg, Daniel B. Shapiro, and K. G. Valente 


The Principal Axis Theorem, included in most undergraduate texts in Linear 
Algebra though often without proof, states that every symmetric matrix over the 
field of real numbers is orthogonally similar to a diagonal matrix. In [1], 
S. Friedberg, focusing attention on the underlying field, gave an elementary 
argument to show that there are symmetric matrices over $\mathbb{Z}_p$ (p a prime) which are
not orthogonally similar to a diagonal matrix. Friedberg’s paper concludes with a
problem: “Classify exactly those fields for which the Principal Axis Theorem is
true.” As solutions to this classification problem can be found in the literature (see
[2] and [10] for example) and frequent reconsiderations of this topic indicate an 
interest to a wide audience of mathematicians, the purpose of this paper is to give 
a simplified overview of this beautiful result. As we proceed, we keep an eye 
toward the accessibility of the argument. In fact, with the exception of two 
technical results, this development can be incorporated in any undergraduate-level 
course in Linear Algebra that deals with arbitrary fields. For example, one can 
show quite easily that it is necessary for the field to have characteristic equal to 
zero in order to ensure that symmetric matrices are diagonalizable. While it is a
rather straightforward matter to establish a large class of fields which allows for 
the orthogonal diagonalization of symmetric matrices, one of the aforementioned 
technical results is crucial in the final step of the classification of such fields. 

A field F is said to have the Principal Axis Property if every symmetric matrix 
over F is orthogonally similar to a diagonal matrix over F. That is, for every 
symmetric matrix M over F, there exists an orthogonal matrix P over F (that is, 
$P^{-1} = P^t$) such that $P^{-1}MP$ is diagonal.

A study of the $2 \times 2$ case provides some important information. We write
char(F) for the characteristic of F. 

Lemma 1. Suppose every symmetric $2 \times 2$ matrix over F is diagonalizable over F.
Then

(i) $\sqrt{-1} \notin F$,
(ii) every sum of squares in F is a square in F, and
(iii) char(F) = 0.


Proof: Just suppose there exists $i \in F$ with $i^2 = -1$. Then the matrix

\[
\begin{pmatrix} 1 & i \\ i & -1 \end{pmatrix}
\]

has characteristic polynomial $x^2$. If this matrix were diagonalizable, it would have
to be the zero matrix. This contradiction establishes (i). As an immediate consequence
we have char(F) ≠ 2, for if char(F) = 2, then $1 = -1$ in F, so that $i = 1$
satisfies $i^2 = -1$.

To prove (ii) it suffices to show that if $a, b \in F$ then $a^2 + b^2$ is a perfect square
in F. To see this we consider the matrix

\[
M = \begin{pmatrix} a & b/2 \\ b/2 & 0 \end{pmatrix}
\]

which has eigenvalues $\tfrac{1}{2}\left(a \pm \sqrt{a^2 + b^2}\right)$. Since M is diagonalizable over F, we
know that these eigenvalues lie in F, so that $\sqrt{a^2 + b^2} \in F$. Property (iii) now
follows from (i) and (ii), for if char(F) = p > 0 then $-1 = p - 1$ is a sum of
squares in F (a sum of p − 1 ones), hence a square by (ii), contradicting (i). □
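
The three steps of the proof can be tried out on small examples. The Python sketch below is only an illustration: it works over the field with five elements, where 2 plays the role of i because $2^2 = 4 = -1$, checks that the matrix from the first step is a nonzero symmetric matrix whose square vanishes, and then checks the eigenvalue formula of the second step over the rationals for the sample values a = 3, b = 4. All names and sample values are chosen for the example.

```python
from fractions import Fraction

def matmul_mod(a, b, p):
    """Product of two 2 x 2 matrices with entries reduced modulo p."""
    return [[sum(a[r][k] * b[k][c] for k in range(2)) % p for c in range(2)]
            for r in range(2)]

P = 5
I_UNIT = 2                      # 2 * 2 = 4 = -1 (mod 5)
M = [[1, I_UNIT], [I_UNIT, (-1) % P]]
M_squared = matmul_mod(M, M, P)

# M is symmetric and nonzero but M^2 = 0, so M cannot be diagonalizable.
print("M   =", M)
print("M^2 =", M_squared)

# In characteristic p, -1 is the sum of p - 1 copies of 1^2.
sum_of_ones = sum(1 * 1 for _ in range(P - 1)) % P
print("sum of p - 1 ones mod p:", sum_of_ones, "which equals -1 mod p:", (-1) % P)

# Step (ii) over the rationals with a = 3, b = 4, where a^2 + b^2 = 5^2.
a, b, root = Fraction(3), Fraction(4), Fraction(5)
eigenvalues = [(a + root) / 2, (a - root) / 2]

def char_poly(x):
    """Characteristic polynomial of [[a, b/2], [b/2, 0]]."""
    return x * x - a * x - b * b / 4

print("eigenvalues:", eigenvalues, "residuals:", [char_poly(e) for e in eigenvalues])
```

The finite-field part shows concretely why a square root of −1 obstructs diagonalization, in line with Friedberg's observation about $\mathbb{Z}_p$.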


Every field satisfying the conditions of Lemma 1 must possess an “ordering”. To 
explain why this is so, we outline some of the properties of ordered fields. These 
ideas were introduced by E. Artin and O. Schreier in the 1920’s and have since 
appeared in many algebra texts. The basic idea is that an order relation on F, 
which respects the field operations of F, is determined by the “cone” P of 
non-negative elements. This P is taken as the fundamental object. 


Definition. An ordering on a field F is a subset $P \subseteq F$ satisfying (i) $P + P \subseteq P$;
(ii) $P \cdot P \subseteq P$; (iii) $P \cap -P = \{0\}$; (iv) $P \cup -P = F$.


Here $-P := \{-a : a \in P\}$. Given an ordering P, we define an order relation $\le$
on F as follows: $a \le b$ if and only if $b - a \in P$. The reader is invited to derive the
familiar properties of “less-than-or-equal” from the given axioms. For example,
$a^2 \ge 0$ for every element a. This follows from (ii) if $a \in P$. Otherwise, $a \notin P$ and
(iv) implies that $-a \in P$, so that $a^2 = (-a)^2 \in P$. From (i) it follows that every
sum of squares in F must lie in P. In particular $1 = 1^2 \in P$, so $-1 \notin P$ by (iii), and
$-1$ cannot be a sum of squares. This proves that if F admits an ordering, then
F must be “formally real” in the following sense.


Definition. A field F is formally real if —1 is not expressible as a sum of squares 
in F. 


From our remarks above, we see that the complex field C has no ordering. Also
if char(F) > 0, then F has no ordering. However some fields admit several
orderings. For instance if $\sigma: F \to R$ is a homomorphism into the field of real
numbers, then $P = \sigma^{-1}([0, \infty))$ is an ordering on F. Distinct embeddings of F into
R yield distinct orderings of F. For example $Q(\sqrt{2})$ possesses two orderings.
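
To make the example of $Q(\sqrt{2})$ concrete, here is a minimal Python sketch. The representation of $a + b\sqrt{2}$ as a pair of rationals and the function names are assumptions made for illustration. It decides membership in the cone $P = \sigma^{-1}([0, \infty))$ for each of the two embeddings, using exact rational arithmetic only.

```python
from fractions import Fraction


def sign_a_plus_b_sqrt2(a: Fraction, b: Fraction) -> int:
    """Exact sign (-1, 0 or 1) of the real number a + b*sqrt(2), a and b rational."""
    if a >= 0 and b >= 0:
        return 1 if (a > 0 or b > 0) else 0
    if a <= 0 and b <= 0:
        return -1
    # a and b have opposite signs: compare a^2 with 2*b^2 (never equal for rationals).
    if a > 0:  # b < 0
        return 1 if a * a > 2 * b * b else -1
    return 1 if 2 * b * b > a * a else -1  # a < 0 < b


def in_cone(a, b, embedding: int) -> bool:
    """Is a + b*sqrt(2) in P = sigma^{-1}([0, oo))?  embedding=+1 sends sqrt(2)
    to sqrt(2); embedding=-1 sends sqrt(2) to -sqrt(2)."""
    return sign_a_plus_b_sqrt2(Fraction(a), embedding * Fraction(b)) >= 0


# sqrt(2) itself is non-negative in one ordering and negative in the other.
print(in_cone(0, 1, +1), in_cone(0, 1, -1))
```

The element $\sqrt{2}$ separates the two cones, which is why the two orderings are distinct.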

Our first technical result completes the connection between ordered and 
formally real fields. 


Theorem 1. A field F admits an ordering if and only if F is formally real. 


This famous theorem was first proved by Artin in the 1920’s as part of his 
solution to Hilbert’s 17th problem. The construction of an ordering on a formally
real field invokes the Axiom of Choice and appears in a number of texts, including 
[7], [8] and [9]. 

The second property appearing in Lemma 1 has also been given a name. 


Definition. A field F is pythagorean if every sum of squares in F is a square in F. 

The fields of real and complex numbers are pythagorean, while the field of
rationals is not. Using this new terminology, Lemma 1 can be restated as follows: if
symmetric $2 \times 2$ matrices over F can be diagonalized over F, then F must be formally real
and pythagorean. Therefore, in our search for fields which satisfy the Principal
Axis Property, we may restrict our attention to formally real pythagorean fields.
We now point out that such fields allow for the Gram-Schmidt orthogonalization
process on $F^n$.


Lemma 2. Let F be a formally real pythagorean field and let $\ge$ be any order
relation on F. If $u = (u_1, \ldots, u_n)$ and $v = (v_1, \ldots, v_n)$ in $F^n$, define

$$\langle u, v \rangle = u_1 v_1 + \cdots + u_n v_n.$$

Then this map $\langle \cdot, \cdot \rangle : F^n \times F^n \to F$ is an inner product, and for every $v \in F^n$
there exists a unique element $\|v\| \in F$ with $\|v\| \ge 0$ and $\|v\|^2 = \langle v, v \rangle$.


Proof: For $v \in F^n$ we see that $\langle v, v \rangle = v_1^2 + \cdots + v_n^2 \ge 0$ in F. Moreover if $v \ne 0$
then some $v_i \ne 0$ and $\langle v, v \rangle \ne 0$. It follows that $\langle \cdot, \cdot \rangle$ is an inner product. Since
F is pythagorean we know that $\langle v, v \rangle$ is a square in F. Every square in F has a
unique non-negative square root in F, and the lemma follows. □


We are now in a position to rephrase the question found in [5]. 


Theorem 2. Let F be a formally real pythagorean field. The following are equivalent: 


(i) F has the Principal Axis Property, 
(ii) Every symmetric matrix over F is diagonalizable over F, and
(iii) Every symmetric matrix over F has an eigenvalue in F.


Proof: Clearly (i) ⇒ (ii) ⇒ (iii). To prove (iii) ⇒ (i) we let M be an $n \times n$ symmetric
matrix over F and proceed by induction on n. The result is clear when
n = 1 so we may assume n > 1. Let $\{e_1, \ldots, e_n\}$ be the standard orthonormal basis
of $F^n$. By (iii) the matrix M has an eigenvalue $k \in F$, and there exists a
corresponding eigenvector $w \in F^n$. Complete this vector to a basis of $F^n$, and
apply the Gram-Schmidt process to obtain an orthonormal basis $\{u_1, \ldots, u_n\}$,
where $u_1 = w/\|w\|$. Let P be the matrix with columns equal to these vectors $u_i$
and let $S = P^{-1}MP$. Then P is an orthogonal matrix since the columns form an
orthonormal basis. Therefore S is also symmetric. The first column of S is $Se_1 =
P^{-1}MPe_1 = P^{-1}Mu_1 = k \cdot P^{-1}u_1 = ke_1$. The first row of S is then determined by
the symmetry and we see that

$$S = \begin{pmatrix} k & 0 \\ 0 & S_1 \end{pmatrix},$$

where $S_1$ is a symmetric $(n - 1) \times (n - 1)$ matrix. By induction, there exists an
$(n - 1) \times (n - 1)$ matrix $R_0$ such that $R_0^{-1} = R_0^{t}$ and $R_0^{-1} S_1 R_0 = D$ where D is a
diagonal matrix. Now, set

$$R = \begin{pmatrix} 1 & 0 \\ 0 & R_0 \end{pmatrix}$$

and consider the matrix PR. We see that $(PR)^{-1} = (PR)^{t}$, and

$$(PR)^{-1}M(PR) = \begin{pmatrix} k & 0 \\ 0 & D \end{pmatrix}. \qquad \square$$


We note that when working over the field of reals or real algebraic numbers one 
can appeal to the standard arguments involving the Fundamental Theorem of 
Algebra to conclude by Theorem 2 that all symmetric matrices can be diagonalized. In
particular, both of these fields are real closed. For our purposes, a field F is real
closed if it is pythagorean, formally real, and $F(\sqrt{-1})$ is algebraically closed.
(There are many other equivalent definitions for real closed.) This class of fields 
was also studied by Artin and Schreier and further information regarding these 
fields can be found in the aforementioned texts. 

This theorem also implies that an intersection of fields satisfying the Principal 
Axis Property again has that property, although some care must be taken to ensure 
that the intersection makes sense. To guarantee that the field operations are 
compatible, we assume that all the fields in question are subfields of some larger 
field. 


Corollary. 


(i) Any real closed field satisfies the Principal Axis Property. 

(ii) Let $\Omega$ be a field with $\{F_\alpha\}$ a collection of subfields. If each field $F_\alpha$ satisfies the
Principal Axis Property, then their intersection also satisfies the Principal
Axis Property.


Proof: Using the definition of real closed given above, the standard argument
establishing the existence of a real eigenvalue for a real symmetric matrix can be
adapted to prove (i). For a more complete development of diagonalization over
real closed fields, one can also see [9].

To prove (ii) let $F = \bigcap_\alpha F_\alpha$ be the intersection. By Lemma 1 and the subsequent
definitions, we know that each $F_\alpha$ is formally real and pythagorean, and
therefore so is F. If M is a symmetric matrix over F, then Theorem 2 implies that
all of the eigenvalues of M lie in $F_\alpha$. Since this is true for every $\alpha$, we see that the
eigenvalues lie in F. The claim follows by another application of Theorem 2. □


With this corollary we see that the intersection of any collection of real closed 
fields (that are subfields of a common field) satisfies the Principal Axis Property. 
In fact, these are the only fields with this property. To see this we need a
second technical result due to F. Krakowski [6]. As before, we must assume all the
fields under consideration lie inside some larger field. To this end, let F be a field
with $\Omega$ a fixed algebraically closed extension of F. Set

$$R(F) = \bigcap \{K \mid F \subseteq K \subseteq \Omega \text{ and } K \text{ is real closed}\}.$$

Note, using Theorem 1, if F is not formally real, then R(F) is trivial. On the other
hand, if F is formally real, then R(F) is a formally real pythagorean extension of
F. Further information regarding the construction of R(F) can be found in [3], [7]
and [10].


Theorem 3. Let F be a formally real pythagorean field. For any $a \in R(F)$, there
exists a symmetric matrix over F having a as an eigenvalue. 


Proof: (Sketch) Let V denote the field F(a) with $B: V \times V \to F$ the trace form.
That is,

$$B(x, y) = \mathrm{tr}(xy)$$

where tr is the trace mapping from V to F. Let $T: V \to V$ be the F-linear map
defined by T(b) = ab. Since $B(T(x), y) = B(x, T(y))$, T is self-adjoint with
respect to the symmetric bilinear form B. By our choice of a, B is positive definite
with respect to every possible ordering of F. In other words, in any diagonal
representation of B, every diagonal entry must be a sum of squares in F and
therefore a square as F is pythagorean. With this, one can choose a basis for V so
that the matrix for B with respect to this basis is the identity matrix. Letting S
represent the matrix of T relative to this basis, the self-adjoint behavior of T
implies that S is symmetric. By construction, a is an eigenvalue of T and therefore
an eigenvalue of S. □


The proof of this theorem shows that if $a \in R(F)$ has degree n over F then a is
an eigenvalue of some symmetric $n \times n$ matrix. For a more general field F and
$a \in R(F)$ it is interesting to ask what size of symmetric matrix is required to have
a as an eigenvalue. For example, for the field Q of rational numbers, R(Q) is the
set of real algebraic numbers. In [1], E. Bender showed that every real algebraic
number of degree n is an eigenvalue of some symmetric $(n + 1) \times (n + 1)$ matrix
over Q. The analogous question for algebraic integers as eigenvalues of symmetric
integer matrices has been considered by D. Estes [4].
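
For a concrete instance of the size question, the following Python sketch (an illustration of ours, not taken from [1]) computes the characteristic polynomial of one particular symmetric $2 \times 2$ rational matrix and finds $x^2 - 2$. So for the degree-2 real algebraic number $\sqrt{2}$ a symmetric matrix of size 2 already suffices, even though Q is not pythagorean and Theorem 3 does not apply to it directly.

```python
from fractions import Fraction


def charpoly_2x2(m):
    """Coefficients (1, -trace, det) of x^2 - tr(M) x + det(M) for a 2x2 matrix."""
    (p, q), (r, s) = m
    return (Fraction(1), -(p + s), p * s - q * r)


# A symmetric matrix over Q chosen by hand for this example.
S = ((Fraction(1), Fraction(1)), (Fraction(1), Fraction(-1)))
print(charpoly_2x2(S))  # coefficients of x^2 - 2, whose roots are +sqrt(2), -sqrt(2)
```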

With this result we can now give a complete characterization of the fields for
which the Principal Axis Property holds. 


Theorem 4. A field F satisfies the Principal Axis Property if and only if F is an 
intersection of real closed fields. 


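Proof: The “if” part is established by the Corollary to Theorem 2. Conversely, suppose
F satisfies the Principal Axis Property, so that F is formally real and pythagorean by
Lemma 1, and let $a \in R(F)$. Choosing, by Theorem 3, a symmetric matrix over F having
a as an eigenvalue, we see that $a \in F$ by hypothesis. Thus F = R(F) and the
characterization is complete. □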

THE CHAUVENET PRIZE.


The committee on the award of the first 
Chauvenet Prize for excellence in mathematical 
exposition, Professors W. C. GRAUSTEIN, ANNA 
PELL WHEELER, and A. B. VAN VLECK, chair- 
man, recommended that the award be made to 
Professor G. A. Bliss of the University of
Chicago for his paper on “Algebraic functions 
and their divisors,” published in the Annals of 
Mathematics, Volume 26, Numbers 1 and 2,
September and December 1924. The Trustees 
voted to approve this choice and to thank the 
members of the committee for their arduous 
but very valuable efforts. The award was announced
at the business meeting and the prize
of one hundred dollars, furnished by a member 
of the Association, was presented to Professor 
Bliss following the meetings. 


33 (1926), 177 

The Fifty-Third William Lowell Putnam
Mathematical Competition 


Leonard F. Klosinski 
Gerald L. Alexanderson 
Loren C. Larson 


The following results of the fifty-third William Lowell Putnam Mathematical 
Competition, held on December 5, 1992, have been determined in accordance with 
the governing regulations. This annual contest is supported by the William Lowell 
Putnam Prize Fund for the Promotion of Scholarship, left by Mrs. Putnam in 
memory of her husband, and is held under the auspices of the Mathematical 
Association of America. 


The first prize, $7,500, was awarded to the Department of Mathematics of Harvard 
University. The members of the winning team were: Jordan S. Ellenberg, Samuel 
A. Kutin, and Royce Y. Peng; each was awarded a prize of $500. 

The second prize, $5,000, was awarded to the Department of Mathematics of 
the University of Toronto. The members of the winning team were: J. P. 
Grossman, Jeff T. Higham, and Hugh R. Thomas; each was awarded a prize of 
$400. 

The third prize, $3,000, was awarded to the Department of Mathematics of the 
University of Waterloo. The members of the winning team were Dorian Birsan, 
Daniel R. L. Brown, and Ian A. Goldberg; each was awarded a prize of $300. 

The fourth prize, $2,000, was awarded to the Department of Mathematics at 
Princeton University. The members of the winning team were Joshua B. Fischman,
Adam M. Logan, and Joel E. Rosenberg; each was awarded a prize of $200. 

The fifth prize, $1,000, was awarded to the Department of Mathematics at 
Cornell University. The members of the winning team were Jon M. Kleinberg, 
Mark Krosky, and Demetrio A. Munoz; each was awarded a prize of $100. 


The five highest ranking individual contestants, in alphabetical order, were 
Jordan S. Ellenberg, Harvard University; Samuel A. Kutin, Harvard University; 
Adam M. Logan, Princeton University; Serban M. Nacu, Harvard University; and 
Jeffrey M. Vanderkam, Duke University. Each of these was designated a Putnam 
Fellow by the Mathematical Association of America and awarded a prize of $1,000 
by the Putnam Prize Fund. 

The next six highest ranking contestants, in alphabetical order, were David B. 
Carlton, Harvard University; Ian A. Goldberg, University of Waterloo; Kiran S. 
Kedlaya, Harvard University; Royce Y. Peng, Harvard University; Hugh R. Thomas, 
University of Toronto; and Tong Zhang, Cornell University; each was awarded a 
prize of $500. 

The next four highest ranking individuals, in alphabetical order, were Ze-Yu
Chen, Princeton University; Jonathan T. Higa, Princeton University; Svetlozar E. 
Nestorov, Stanford University; and Samuel K. Vandervelde, Swarthmore College; 
each was awarded a prize of $250. 

The next nine highest ranking individuals, in alphabetical order, were Daniel 
R. L. Brown, University of Waterloo; Jeff T. Higham, University of Toronto; F. 
Dean Hildebrandt, Harvard University; Julie B. Kerr, Washington State Univer- 
sity; Andrew H. Kresch, Yale University; William R. Mann, Princeton University; 
Dana Pascovici, Dartmouth College; Michail G. Sunitsky, Princeton University; 
and Douglas J. Zare, New College of the University of South Florida; each was 
awarded a prize of $100. 

The following teams, named in alphabetical order, received honorable mention: 
Dartmouth College, with team members Radu Bacioiu, Rolf H. Nelson, and Dana 
Pascovici; Duke University, with team members Craig B. Gentry, Alexander J. 
Hartemink, and Jeffrey M. Vanderkam; Massachusetts Institute of Technology, 
with team members Thomas C. Chou, Henry L. Cohn, and Michael J. Lawler; 
University of British Columbia, with team members Malik H. Kalfane, David L. 
Savitt, and Mark A. Van Raamsdonk; and Yale University, with team members 
Thomas Feng, Andrew H. Kresch, and Zhaohui Zhang. 

Honorable mention was achieved by the following thirty-one individuals named 
in alphabetical order: James McCleery Berger, Brown University; Sergey Brin, 
University of Maryland, College Park; Thomas C. Chou, Massachusetts Institute of 
Technology; Henry L. Cohn, Massachusetts Institute of Technology; Brian D. 
Ewald, University of Michigan, Ann Arbor; Joshua B. Fischman, Princeton Uni- 
versity; J. P. Grossman, University of Toronto; Steven S. Gubser, Princeton 
University; William M. Hesse, University of Connecticut; Adam Kalai, Harvard 
University; Timothy P. Kokesh, Harvey Mudd College; Botond Koszegi, Harvard 
University; Peter R. Kramer, Princeton University; Mark Krosky, Cornell Univer- 
sity; Tal N. Kubo, Harvard University; Sergey V. Levin, Harvard University; 
Samuel J. Maltby, University of Calgary; Demetrio A. Munoz, Cornell University; 
Akira Negi, University of North Carolina, Chapel Hill; Seth Padowitz, Brown 
University; Andrew Przeworski, Massachusetts Institute of Technology; Philip T. 
Reiss, University of Manitoba; James P. Sarvis, Massachusetts Institute of Tech- 
nology; Kannan Soundararajan, University of Michigan, Ann Arbor; Michael G. 
Szydlo, Boston University; Joe Y. Tien, University of California, Irvine; Mark A. 
Van Raamsdonk, University of British Columbia; Jeffrey D. Wall, Princeton 
University; Kelly Lynne Wieand, University of Wisconsin, Madison; Erick B. 
Wong, Simon Fraser University; and Zhaohui Zhang, Yale University. 

The other individuals who achieved ranks among the top 98, in alphabetical
order of their schools, were: Brigham Young University, John Wesley Robertson; 
University of British Columbia, David L. Savitt; Brown University, Andrew Brecher; 
California Institute of Technology, Steven C. Anderson; University of California, 
Berkeley, Daniel C. Isaksen; University of Colorado, Boulder, Steve T. Soulé; 
Cornell University, Jon M. Kleinberg; Dartmouth College, Radu Bacioiu; Duke 
University, Alexander J. Hartemink; Harvard University, Manjul Bhargava, Joseph 
I. Chuang, Michael L. Hutchings, Dimitri Kountourogiannis, Paul Li, Matteo J. 
Paris, Chris Ternoey; Harvey Mudd College, Jon H. Leonard; University of Maine, 
Orono, YuQun Chen; Massachusetts Institute of Technology, Jerome S. Khohayt- 
ing, Tichomir G. Tenev, William W. Tucker; Memorial University of Newfound- 
land, Robert P. Gallant; Michigan State University, Thomas P. Hayes; University 
of Minnesota, Minneapolis, Matthew P. Kelly; Université de Montréal, Marc-André 
Lafortune; New York University, Mikhail Kogan; Ohio State University, Frank J.
Swenton; University of Pennsylvania, Frosti Petursson; Princeton University, Tibor 
Beke, Mark W. Lucianovic; Purdue University, Pok-Yin Yu; Rice University, 
Donald A. Barkauskas; Rose Hulman Institute of Technology, Jonathan E. Atkins; 
Stanford University, Daniel P. Cory, Garrett R. Vargas; Texas A & M University, 
Zheng-Zheng Li; University of Waterloo, Dorian Birsan, Kevin K. Cheung, Jie J. 
Lou; Wellesley College, Yihao L. Zhang; West Virginia Wesleyan College, Emanuel 
V. Todorov; and Yale University, Matthew Frank. 

The Elizabeth Lowell Putnam Prize, named for the wife of William Lowell 
Putnam and to be “awarded periodically to a woman whose performance on the 
Competition has been deemed particularly meritorious”, is awarded this year for
the first time to Dana Pascovici of Dartmouth College. The winner is awarded a 
prize of $500. 

There were 2421 individual contestants from 393 colleges and universities in 
Canada and the United States in the competition of December 5, 1992. Teams 
were entered by 284 institutions. 

The Questions Committee for the fifty-third competition consisted of George E. 
Andrews (Chair), George T. Gilbert, and Eugene Luks; they composed the 
problems listed below and were most prominent among those suggesting solutions. 


PROBLEMS 


Problem A-1.

Prove that $f(n) = 1 - n$ is the only integer-valued function defined on the
integers that satisfies the following conditions:

(i) $f(f(n)) = n$, for all integers n;
(ii) $f(f(n + 2) + 2) = n$ for all integers n;
(iii) $f(0) = 1$.


Problem A-2.

Define $C(\alpha)$ to be the coefficient of $x^{1992}$ in the power series expansion about
x = 0 of $(1 + x)^{\alpha}$. Evaluate

$$\int_0^1 C(-y - 1)\left(\frac{1}{y + 1} + \frac{1}{y + 2} + \frac{1}{y + 3} + \cdots + \frac{1}{y + 1992}\right) dy.$$


Problem A-3. 


For a given positive integer m, find all triples (n, x, y) of positive integers, with 
n relatively prime to m, which satisfy $(x^2 + y^2)^m = (xy)^n$.

Problem A-4.

Let f be an infinitely differentiable real-valued function defined on the real
numbers. If

$$f\left(\frac{1}{n}\right) = \frac{n^2}{n^2 + 1}, \qquad n = 1, 2, 3, \ldots,$$

compute the values of the derivatives $f^{(k)}(0)$, k = 1, 2, 3, ....

Problem A-5.

For each positive integer n, let

$$a_n = \begin{cases} 0 & \text{if the number of 1's in the binary representation of } n \text{ is even,} \\ 1 & \text{if the number of 1's in the binary representation of } n \text{ is odd.} \end{cases}$$

Show that there do not exist positive integers k and m such that

$$a_{k+j} = a_{k+m+j} = a_{k+2m+j}, \qquad \text{for } 0 \le j \le m - 1.$$


Problem A-6. 


Four points are chosen at random on the surface of a sphere. What is the 
probability that the center of the sphere lies inside the tetrahedron whose vertices 
are at the four points? (It is understood that each point is independently chosen 
relative to a uniform distribution on the sphere.) 

Problem B-1.


Let S be a set of n distinct real numbers. Let $A_S$ be the set of numbers that
occur as averages of two distinct elements of S. For a given $n \ge 2$, what is the
smallest possible number of distinct elements in $A_S$?


Problem B-2. 


For nonnegative integers n and k, define Q(n, k) to be the coefficient of $x^k$ in
the expansion of $(1 + x + x^2 + x^3)^n$. Prove that

$$Q(n, k) = \sum_{j=0}^{k} \binom{n}{j}\binom{n}{k - 2j},$$

where $\binom{a}{b}$ is the standard binomial coefficient. (Reminder: For integers a and b
with $a \ge 0$, $\binom{a}{b} = a!/(b!(a - b)!)$ for $0 \le b \le a$, and $\binom{a}{b} = 0$ otherwise.)


Problem B-3. 


For any pair (x, y) of real numbers, a sequence $(a_n(x, y))_{n \ge 0}$ is defined as
follows:

$$a_0(x, y) = x,$$

$$a_{n+1}(x, y) = \frac{(a_n(x, y))^2 + y^2}{2}, \qquad \text{for all } n \ge 0.$$

Find the area of the region $\{(x, y) \mid (a_n(x, y))_{n \ge 0} \text{ converges}\}$.

Problem B-4.


Let p(x) be a nonzero polynomial of degree less than 1992 having no nonconstant
factor in common with $x^3 - x$. Let

$$\frac{d^{1992}}{dx^{1992}}\left(\frac{p(x)}{x^3 - x}\right) = \frac{f(x)}{g(x)}$$

for polynomials f(x) and g(x). Find the smallest possible degree of f(x).


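Problem B-5.


Let $D_n$ denote the value of the $(n - 1) \times (n - 1)$ determinant

$$\begin{vmatrix} 3 & 1 & 1 & 1 & \cdots & 1 \\ 1 & 4 & 1 & 1 & \cdots & 1 \\ 1 & 1 & 5 & 1 & \cdots & 1 \\ 1 & 1 & 1 & 6 & \cdots & 1 \\ \vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & 1 & 1 & 1 & \cdots & n + 1 \end{vmatrix}.$$

Is the set $\{D_n/n!\}_{n \ge 2}$ bounded?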


Problem B-6.

Let $\mathcal{M}$ be a set of real $n \times n$ matrices such that

(i) $I \in \mathcal{M}$, where I is the $n \times n$ identity matrix;
(ii) if $A \in \mathcal{M}$ and $B \in \mathcal{M}$, then either $AB \in \mathcal{M}$ or $-AB \in \mathcal{M}$, but not both;
(iii) if $A \in \mathcal{M}$ and $B \in \mathcal{M}$, then either AB = BA or AB = −BA;
(iv) if $A \in \mathcal{M}$ and $A \ne I$, there is at least one $B \in \mathcal{M}$ such that AB = −BA.

Prove that $\mathcal{M}$ contains at most $n^2$ matrices.


SOLUTIONS 


In the 12-tuples $(n_{10}, n_9, \ldots, n_0, n_{-1})$ following each problem number below, $n_i$
for $10 \ge i \ge 0$ is the number of students among the top 203 contestants achieving i
points for the problem and $n_{-1}$ is the number of those not submitting solutions.


A-1 (31, 82, 42, 10, 0, 0, 0, 7, 23, 6, 2, 0)


Solution. If $f(n) = 1 - n$, then $f(f(n)) = f(1 - n) = 1 - (1 - n) = n$, so (i)
holds. Similarly, $f(f(n + 2) + 2) = f((-n - 1) + 2) = f(1 - n) = n$, so (ii) holds.
Clearly (iii) holds, and so $f(n) = 1 - n$ satisfies the conditions.

Conversely, suppose f satisfies the three given conditions. From condition (ii),
$f(f(f(n + 2) + 2)) = f(n)$, and applying (i) yields $f(n + 2) + 2 = f(n)$ or
$f(n + 2) = f(n) - 2$. An easy induction yields

$$f(n) = \begin{cases} f(0) - n & \text{if } n \text{ is even,} \\ f(1) + 1 - n & \text{if } n \text{ is odd.} \end{cases}$$

If $f(0) = 1$, then $f(1) = 0$ by (i), therefore, $f(n) = 1 - n$.
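
The first half of this solution is easy to confirm mechanically. The sketch below is an illustration only; the function names and the finite test window are assumptions of ours. It checks conditions (i) to (iii) for $f(n) = 1 - n$ on a range of integers and shows a candidate that fails.

```python
def f(n: int) -> int:
    """The claimed unique solution f(n) = 1 - n."""
    return 1 - n


def satisfies_conditions(g, bound: int = 1000) -> bool:
    """Check conditions (i)-(iii) for every integer n with |n| <= bound."""
    window = range(-bound, bound + 1)
    return (
        all(g(g(n)) == n for n in window)              # condition (i)
        and all(g(g(n + 2) + 2) == n for n in window)  # condition (ii)
        and g(0) == 1                                  # condition (iii)
    )


print(satisfies_conditions(f))
# g(n) = -n satisfies (i) and (ii) but has g(0) = 0, violating (iii).
print(satisfies_conditions(lambda n: -n))
```

A finite check of this kind cannot replace the uniqueness argument, which is what the induction above supplies.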


A-2 (157, 1, 0, 0, 0, 0, 0, 0, 2, 14, 14, 15)


Solution. From the binomial series, we see that

$$C(-y - 1) = \frac{(-y - 1)(-y - 2) \cdots (-y - 1992)}{1992!} = \frac{(y + 1)(y + 2) \cdots (y + 1992)}{1992!}.$$

Therefore,

$$C(-y - 1)\left(\frac{1}{y + 1} + \frac{1}{y + 2} + \cdots + \frac{1}{y + 1992}\right) = \frac{d}{dy}\left(\frac{(y + 1)(y + 2) \cdots (y + 1992)}{1992!}\right).$$

Hence the integral in question is

$$\int_0^1 \frac{d}{dy}\left(\frac{(y + 1)(y + 2) \cdots (y + 1992)}{1992!}\right) dy = \left[\frac{(y + 1)(y + 2) \cdots (y + 1992)}{1992!}\right]_0^1 = 1993 - 1 = 1992.$$
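
The evaluation can be cross-checked in two ways: exactly, by evaluating the antiderivative at the endpoints, and numerically, by integrating the original integrand for a small even value of N in place of 1992. The following Python sketch does both; the function names and the midpoint rule are choices made for this illustration.

```python
from fractions import Fraction


def endpoint_value(N: int = 1992) -> Fraction:
    """P(1) - P(0), where P(y) = (y+1)(y+2)...(y+N) / N! is the antiderivative."""
    def P(y: int) -> Fraction:
        value = Fraction(1)
        for k in range(1, N + 1):
            value *= Fraction(y + k, k)  # builds the product factor by factor
        return value

    return P(1) - P(0)


def integrand(y: float, N: int) -> float:
    """C(-y-1) * (1/(y+1) + ... + 1/(y+N)); for even N, C(-y-1) = prod (y+k)/k."""
    coeff = 1.0
    for k in range(1, N + 1):
        coeff *= (y + k) / k
    return coeff * sum(1.0 / (y + k) for k in range(1, N + 1))


def midpoint_integral(N: int, steps: int = 2000) -> float:
    """Midpoint-rule approximation of the integral over [0, 1]."""
    h = 1.0 / steps
    return h * sum(integrand((i + 0.5) * h, N) for i in range(steps))


print(endpoint_value())      # exact value for N = 1992
print(midpoint_integral(6))  # numerical value for the small analogue N = 6
```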


A-3 (55, 20, 7, 0, 0, 0, 0, 0, 16, 7, 45, 53) 


Solution. There are no solutions if m is odd. If m is even, the only solution is

$$(n, x, y) = (m + 1,\ 2^{m/2},\ 2^{m/2}).$$

If (n, x, y) is a solution, then by the arithmetic-mean-geometric-mean inequality,
$(xy)^n = (x^2 + y^2)^m \ge (2xy)^m$, so n > m. Let p be a prime number. Let $p^a$ and
$p^b$ be the largest powers of p that divide x and y, respectively. Then the exponent of
the largest power of p dividing $(xy)^n$ is (a + b)n. If a < b, the exponent of the largest
power of p dividing $(x^2 + y^2)^m$ is 2am. But this implies that (a + b)n = 2am, and this
contradicts n > m. Similarly, the assumption a > b leads to a contradiction. Therefore a = b
for all primes p, and we conclude that x = y. Thus, the equation reduces to
$(2x^2)^m = x^{2n}$, or equivalently, $x^{2(n-m)} = 2^m$. It follows that x is a positive power
of 2, say $2^a$. This implies 2(n − m)a = m, or, 2an = (2a + 1)m. Since gcd(m, n)
= gcd(2a, 2a + 1) = 1, we must have m = 2a and n = 2a + 1. Thus, m is
necessarily even and the solution follows as claimed.
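
An exhaustive search over a small box agrees with this conclusion. The sketch below is illustrative; the search bounds are arbitrary choices of ours. It lists every triple (n, x, y) with n coprime to m that satisfies the equation, for m = 1, 2, 3, 4.

```python
from math import gcd


def solutions(m: int, n_max: int = 8, xy_max: int = 20):
    """All (n, x, y) with gcd(n, m) = 1, n <= n_max and x, y <= xy_max that
    satisfy (x^2 + y^2)^m == (x*y)^n, found by exhaustive search."""
    found = []
    for n in range(1, n_max + 1):
        if gcd(n, m) != 1:
            continue
        for x in range(1, xy_max + 1):
            for y in range(1, xy_max + 1):
                if (x * x + y * y) ** m == (x * y) ** n:
                    found.append((n, x, y))
    return found


for m in range(1, 5):
    print(m, solutions(m))
```

Within these bounds the only hits are (3, 2, 2) for m = 2 and (5, 4, 4) for m = 4, matching $(m + 1, 2^{m/2}, 2^{m/2})$.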


A-4 (17, 6, 7, 0, 0, 0, 2, 0, 73, 18, 47, 33)


Solution. We will show that

$$f^{(k)}(0) = \begin{cases} (-1)^{k/2}\, k! & \text{if } k \text{ is even,} \\ 0 & \text{if } k \text{ is odd.} \end{cases}$$

First we note that if h(x) is a differentiable function and $x_1, x_2, \ldots$ is a
sequence strictly decreasing to 0 such that $h(x_n) = 0$, then by Rolle’s Theorem,
there exists a sequence $y_1, y_2, \ldots$ strictly decreasing to 0, such that $h'(y_n) = 0$
$(x_{n+1} < y_n < x_n)$.

Now let $g(x) = f(x) - 1/(1 + x^2)$. Then g(1/n) = 0 for n = 1, 2, ... . Applying
the result of the preceding paragraph to $g, g', g'', \ldots$ and invoking the
continuity of $g^{(k)}$ at 0, we see that $g^{(k)}(0) = 0$ for k = 0, 1, 2, 3, ... . Thus,

$$f^{(k)}(0) = \left.\frac{d^k}{dx^k}\left(\frac{1}{1 + x^2}\right)\right|_{x=0}.$$

The Maclaurin series for $1/(1 + x^2)$ is $\sum_{k=0}^{\infty} (-1)^k x^{2k}$, and hence $f^{(k)}(0)$ is
equal to the values given above.
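
The values can be tabulated directly from the Maclaurin coefficients. The following Python sketch is a hedged illustration: the helper for inverting a power series is our own, not part of the solution. It computes the first few coefficients of $1/(1 + x^2)$ exactly and multiplies by k!.

```python
from fractions import Fraction
from math import factorial


def maclaurin_coeffs_of_reciprocal(p, terms: int):
    """First `terms` Maclaurin coefficients of 1/p(x), where p lists the
    polynomial coefficients (constant term first, p[0] != 0)."""
    inv = [Fraction(1) / p[0]]
    for k in range(1, terms):
        # The coefficient of x^k in p(x) * inv(x) must vanish for k >= 1.
        s = sum(Fraction(p[j]) * inv[k - j] for j in range(1, min(k, len(p) - 1) + 1))
        inv.append(-s / p[0])
    return inv


def derivatives_at_zero(terms: int = 9):
    """f^(k)(0) = k! * (k-th Maclaurin coefficient of 1/(1 + x^2))."""
    coeffs = maclaurin_coeffs_of_reciprocal([1, 0, 1], terms)
    return [factorial(k) * c for k, c in enumerate(coeffs)]


print(derivatives_at_zero())
```

The odd-order entries vanish and the even-order entries alternate in sign with magnitude k!, as stated.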


A-5 (1, 9, 1, 0, 0, 0, 0, 0, 5, 3, 72, 112) 


Solution. Observe that $a_{2n} = a_n$ and $a_{2n+1} = 1 - a_{2n} = 1 - a_n$.

Suppose that there exist k, m as above, and we may assume m is minimal for
such.

Suppose first that m is odd. We’ll suppose $a_k = a_{k+m} = a_{k+2m} = 0$, as it will be
clear that the case $a_k = 1$ can be treated similarly. Since either k or k + m is
even, $a_{k+1} = a_{k+m+1} = a_{k+2m+1} = 1$. Again, since either k + 1 or k + m + 1 is
even, $a_{k+2} = a_{k+m+2} = a_{k+2m+2} = 0$. By this means, we see that the terms
$a_k, a_{k+1}, a_{k+2}, \ldots, a_{k+m-1}$ alternate between 0 and 1. Then since m − 1 is even,
$a_{k+m-1} = a_{k+2m-1} = a_{k+3m-1} = 0$. But, since either k + m − 1 or k + 2m − 1
is even, that would imply that $a_{k+m} = a_{k+2m} = 1$, a contradiction.

Thus, m must be even. Extracting the terms with even indices in

$$a_{k+j} = a_{k+m+j} = a_{k+2m+j}, \qquad \text{for } 0 \le j \le m - 1,$$

and using the fact that $a_r = a_{r/2}$ for even r, we get

$$a_{\lceil k/2 \rceil + i} = a_{\lceil k/2 \rceil + (m/2) + i} = a_{\lceil k/2 \rceil + m + i}, \qquad \text{for } 0 \le i \le (m/2) - 1.$$

(The even numbers $\ge k$ are $2\lceil k/2 \rceil, 2\lceil k/2 \rceil + 2, \ldots$ .) This contradicts the
minimality of m.

Hence, there are no such k and m.
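
The statement can also be tested by brute force on an initial segment of the sequence. The sketch below is illustrative; the search limit is an arbitrary choice of ours. It looks for a block of length m repeated three times in a row starting at a positive index k.

```python
def a(n: int) -> int:
    """Parity of the number of 1's in the binary representation of n."""
    return bin(n).count("1") % 2


def find_repeated_block(limit: int):
    """Search for positive k, m with k + 3m - 1 <= limit such that
    a[k+j] == a[k+m+j] == a[k+2m+j] for all 0 <= j <= m - 1."""
    seq = [a(n) for n in range(limit + 1)]
    for m in range(1, limit // 3 + 1):
        for k in range(1, limit - 3 * m + 2):
            if seq[k:k + m] == seq[k + m:k + 2 * m] == seq[k + 2 * m:k + 3 * m]:
                return k, m
    return None


print(find_repeated_block(300))  # None: no triple repetition among the first 300 terms
```

The search also gives a quick way to confirm the two recurrences $a_{2n} = a_n$ and $a_{2n+1} = 1 - a_n$ used at the start of the solution.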


A-6 (9, 3, 4, 0, 0, 0, 0, 0, 10, 32, 22, 123)


Solution. Recall first that if points A, B, C, D are in general position in 3-space,
then a point E lies inside the tetrahedron ABCD if and only if the barycentric
coordinates of E with respect to A, B, C, D are positive. That is, if we (uniquely)
express

$$\vec{E} = w\vec{A} + x\vec{B} + y\vec{C} + z\vec{D}, \qquad \text{with } w + x + y + z = 1,$$

(the arrows indicating consideration of the coordinate triples as vectors), then E is
in the interior of ABCD if and only if w > 0, x > 0, y > 0, and z > 0. Hence, if
E is the origin, then E is in the interior of ABCD if and only if there is a solution
(w, x, y, z) to

$$\vec{0} = w\vec{A} + x\vec{B} + y\vec{C} + z\vec{D} \qquad (1)$$


with w, x, y, z having the same sign. As the solution space to (1) is 1-dimensional,
this condition holds for one nonzero solution if and only if it holds for all.

Now assume that the center of the sphere is located at the origin and fix the
first chosen point P on the sphere as the north pole, the other three points,
$P_1, P_2, P_3$, then being random.

We may suppose the choice of each $P_i$ is made in two steps, the first choosing a
random diameter $Q_{i1}Q_{i2}$, and the second choosing at random between the endpoints
$Q_{i1}, Q_{i2}$. Since the $2^3 = 8$ possible selections of endpoints of the three
diameters are equally likely, each of the 8 tetrahedra $PQ_{1j_1}Q_{2j_2}Q_{3j_3}$, $j_i = 1$ or
2, are equally likely. We may further suppose that the vertices of each of these
tetrahedra are in general position as the probability of degeneracy is 0. Similarly,
we may suppose that the center of the sphere does not lie on any face of the
tetrahedra.

Let (w, x, y, z) be a nonzero solution to the equation

$$\vec{0} = w\vec{P} + x\vec{Q}_{11} + y\vec{Q}_{21} + z\vec{Q}_{31}.$$

Then, since $\vec{Q}_{i2} = -\vec{Q}_{i1}$, the eight equations

$$\vec{0} = w\vec{P} + x\vec{Q}_{1j_1} + y\vec{Q}_{2j_2} + z\vec{Q}_{3j_3}$$

have respective solutions

$$(w, x, y, z),\ (w, x, y, -z),\ (w, x, -y, z),\ (w, -x, y, z),$$
$$(w, x, -y, -z),\ (w, -x, y, -z),\ (w, -x, -y, z),\ (w, -x, -y, -z).$$

Hence, exactly one of the eight equations has a solution whose coordinates have
the same sign.

It follows that exactly one of these 8 equally likely tetrahedra contains the
center. Thus the probability of including the center is 1/8 for all initial choices of
3 diameters. We conclude that the probability for a random tetrahedron is 1/8.
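
A Monte Carlo experiment gives an independent check of the value 1/8. In the Python sketch below, which is illustrative only, the same-sign criterion from equation (1) is applied to a kernel vector of the $3 \times 4$ matrix with columns A, B, C, D, computed from signed $3 \times 3$ minors. The function names, the sample size and the seed are assumptions of ours.

```python
import random


def det3(u, v, w):
    """Determinant of the 3x3 matrix with columns u, v, w."""
    return (u[0] * (v[1] * w[2] - v[2] * w[1])
            - v[0] * (u[1] * w[2] - u[2] * w[1])
            + w[0] * (u[1] * v[2] - u[2] * v[1]))


def random_point_on_sphere(rng):
    """Uniform point on the unit sphere via a normalized Gaussian vector."""
    while True:
        p = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = sum(c * c for c in p) ** 0.5
        if norm > 1e-12:
            return [c / norm for c in p]


def origin_inside(A, B, C, D) -> bool:
    """The signed minors below solve w*A + x*B + y*C + z*D = 0; the origin is
    interior exactly when all four coefficients have the same sign."""
    w = det3(B, C, D)
    x = -det3(A, C, D)
    y = det3(A, B, D)
    z = -det3(A, B, C)
    return (w > 0 and x > 0 and y > 0 and z > 0) or (w < 0 and x < 0 and y < 0 and z < 0)


def estimate(trials: int = 20000, seed: int = 1992) -> float:
    """Fraction of random tetrahedra (vertices on the sphere) containing the center."""
    rng = random.Random(seed)
    hits = sum(origin_inside(*(random_point_on_sphere(rng) for _ in range(4)))
               for _ in range(trials))
    return hits / trials


print(estimate())  # should be close to 1/8 = 0.125
```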


B-1 (145, 15, 4, 0, 0, 0, 0, 0, 6, 14, 11, 8) 


Solution. The smallest possible number of elements in $A_S$ is 2n − 3.


Let $x_1 < x_2 < \cdots < x_n$ represent the elements of S. Then

$$\frac{x_1 + x_2}{2} < \frac{x_1 + x_3}{2} < \cdots < \frac{x_1 + x_n}{2} < \frac{x_2 + x_n}{2} < \frac{x_3 + x_n}{2} < \cdots < \frac{x_{n-1} + x_n}{2}$$

represent (n − 1) + (n − 2) = 2n − 3 distinct elements of $A_S$, so $A_S$ has at least
2n − 3 distinct elements.

On the other hand, if we take S = {1, 2, ..., n}, the elements of $A_S$ are
$\frac{3}{2}, \frac{4}{2}, \frac{5}{2}, \ldots, \frac{2n-1}{2}$. There are only (2n − 1) − 2 = 2n − 3 such numbers; thus there
is a set $A_S$ with at most 2n − 3 distinct elements. This completes the proof.
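
Both halves of the argument are easy to test. The sketch below is an illustration with helper names and sample sizes of our choosing. It counts distinct averages for S = {1, ..., n} and for some random integer sets.

```python
from itertools import combinations
import random


def num_averages(S) -> int:
    """Number of distinct averages of two distinct elements of S.
    Sums are counted instead of averages; halving is a bijection."""
    return len({a + b for a, b in combinations(S, 2)})


for n in range(2, 8):
    print(n, num_averages(range(1, n + 1)), 2 * n - 3)

# Random sets never do better than the bound 2n - 3.
rng = random.Random(0)
samples = [rng.sample(range(1000), 8) for _ in range(200)]
print(min(num_averages(S) for S in samples) >= 2 * 8 - 3)
```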
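
B-2 (159, 10, 7, 0, 0, 0, 0, 0, 1, 4, 13, 9)


Solution. We have

$$(1 + x + x^2 + x^3)^n = \sum_{k \ge 0} Q(n, k)\, x^k$$

and

$$(1 + x + x^2 + x^3)^n = (1 + x^2)^n (1 + x)^n = \sum_{j \ge 0} \binom{n}{j} x^{2j} \sum_{i \ge 0} \binom{n}{i} x^{i} = \sum_{j \ge 0} \sum_{i \ge 0} \binom{n}{j}\binom{n}{i} x^{i + 2j} = \sum_{k \ge 0} x^k \sum_{j = 0}^{k} \binom{n}{j}\binom{n}{k - 2j}.$$

Comparing coefficients of $x^k$, we derive the desired result.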
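
The identity can be confirmed for small n by expanding the polynomial directly. The Python sketch below is an illustration; the function names are ours. It uses the convention from the problem statement that a binomial coefficient is 0 outside the range $0 \le b \le a$.

```python
from math import comb


def Q_by_expansion(n: int):
    """Coefficient list of (1 + x + x^2 + x^3)^n, lowest degree first."""
    coeffs = [1]
    for _ in range(n):
        new = [0] * (len(coeffs) + 3)
        for i, c in enumerate(coeffs):
            for d in range(4):  # multiply by 1 + x + x^2 + x^3
                new[i + d] += c
        coeffs = new
    return coeffs


def binom(a: int, b: int) -> int:
    """Binomial coefficient with the problem's convention: 0 unless 0 <= b <= a."""
    return comb(a, b) if 0 <= b <= a else 0


def Q_by_formula(n: int, k: int) -> int:
    """Right-hand side of the identity in Problem B-2."""
    return sum(binom(n, j) * binom(n, k - 2 * j) for j in range(k + 1))


print(Q_by_expansion(3))
print([Q_by_formula(3, k) for k in range(10)])
```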


B-3 (23, 11, 10, 0, 0, 0, 0, 0, 27, 24, 71, 37)


Solution. The area is $4 + \pi$. The region of convergence is a (closed) square
$\{(x, y) \mid -1 \le x, y \le 1\}$ of side 2 with (closed) semicircles
of radius 1 centered at $(\pm 1, 0)$ described on two opposite sides.

If $\lim_{n \to \infty} a_n(x, y) = L$, then L must satisfy $L = (L^2 + y^2)/2$; that is, L must
be a root of the equation

$$r^2 - 2r + y^2 = 0. \qquad (1)$$

In such case, the equation must have real roots, so the discriminant, $4 - 4y^2$, must
be nonnegative. Thus, a necessary condition for $(a_n(x, y))$ to converge is that
$|y| \le 1$.

Fix $|y| \le 1$. The roots of (1) are then $1 - \sqrt{1 - y^2}$ and $1 + \sqrt{1 - y^2}$, which
are real and nonnegative. As $a_1(-x, y) = a_1(x, y)$, the interval of convergence is
symmetric about x = 0. We shall assume then that $x \ge 0$; thus, $a_n(x, y) \ge 0$, for
all n.

If $r_0 = 1 \pm \sqrt{1 - y^2}$, then $a_{n+1}(x, y)$ is less than, equal to, or greater than $r_0$
according to whether $a_n(x, y)$ is less than, equal to, or greater than $r_0$ $(= (r_0^2 + y^2)/2)$.

If $a_n(x, y)$ lies in the closed interval $[1 - \sqrt{1 - y^2},\ 1 + \sqrt{1 - y^2}]$, that is,
between the roots of (1), then

$$a_n(x, y)^2 - 2a_n(x, y) + y^2 \le 0,$$

so that

$$1 - \sqrt{1 - y^2} \le a_{n+1}(x, y) \le a_n(x, y).$$

It follows that $(a_n(x, y))_{n \ge 0}$ converges if x is in the closed interval

$$[1 - \sqrt{1 - y^2},\ 1 + \sqrt{1 - y^2}].$$

If $a_n(x, y)$ does not lie in the interval $[1 - \sqrt{1 - y^2},\ 1 + \sqrt{1 - y^2}]$, then

$$a_n(x, y)^2 - 2a_n(x, y) + y^2 > 0,$$

so that

$$a_{n+1}(x, y) > a_n(x, y).$$

Thus, if x, and therefore all $a_n(x, y)$, are greater than $1 + \sqrt{1 - y^2}$, then the
sequence diverges. On the other hand, if x, and therefore all $a_n(x, y)$,
lie between 0 and $1 - \sqrt{1 - y^2}$, the sequence converges monotonically to
$1 - \sqrt{1 - y^2}$.

To summarize, $(a_n(x, y))_{n \ge 0}$ converges if and only if

$$-1 \le y \le 1 \qquad \text{and} \qquad -\left(1 + \sqrt{1 - y^2}\right) \le x \le 1 + \sqrt{1 - y^2}.$$
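
The criterion $|y| \le 1$, $|x| \le 1 + \sqrt{1 - y^2}$ obtained in this solution can be probed numerically by iterating the recursion. The sketch below is a rough illustration: boundedness of the iterates is used as a proxy for convergence, and the step count and cap are arbitrary choices of ours, so points very close to the boundary of the region are not classified reliably.

```python
from math import sqrt


def predicted_to_converge(x: float, y: float) -> bool:
    """The criterion from the solution: |y| <= 1 and |x| <= 1 + sqrt(1 - y^2)."""
    return abs(y) <= 1 and abs(x) <= 1 + sqrt(1 - y * y)


def stays_bounded(x: float, y: float, steps: int = 10_000, cap: float = 1e6) -> bool:
    """Iterate a_{n+1} = (a_n^2 + y^2) / 2 from a_0 = x and report whether the
    iterates stay below `cap`. This is only a numerical proxy for convergence."""
    a = x
    for _ in range(steps):
        a = (a * a + y * y) / 2
        if a > cap:
            return False
    return True


points = [(0.0, 0.0), (1.9, 0.3), (-1.9, 0.3), (2.0, 0.3), (0.5, 1.1), (0.9, 0.9), (2.5, 0.0)]
for x, y in points:
    print((x, y), predicted_to_converge(x, y), stays_bounded(x, y))
```

On these sample points, all chosen away from the boundary, the two columns agree.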


B-4 (35, 11, 13, 0, 0, 0, 0, 0, 12, 5, 48, 79) 




Solution. The smallest possible degree of f(x) is 3984.

By the Division Algorithm, we can write $p(x) = (x^3 - x)q(x) + r(x)$, where
q(x) and r(x) are polynomials, the degree of r(x) is less than 3, and the degree of
q(x) is less than 1989. Then

$$\frac{d^{1992}}{dx^{1992}}\left(\frac{p(x)}{x^3 - x}\right) = \frac{d^{1992}}{dx^{1992}}\left(\frac{r(x)}{x^3 - x}\right).$$

Now, write $r(x)/(x^3 - x)$ in the form

$$\frac{A}{x - 1} + \frac{B}{x} + \frac{C}{x + 1}.$$

Because p(x) and $x^3 - x$ have no nonconstant common factor, neither do r(x)
and $x^3 - x$, and therefore, $ABC \ne 0$. Thus,

$$\frac{d^{1992}}{dx^{1992}}\left(\frac{r(x)}{x^3 - x}\right) = 1992!\left(\frac{A}{(x - 1)^{1993}} + \frac{B}{x^{1993}} + \frac{C}{(x + 1)^{1993}}\right)$$

$$= 1992!\,\frac{A x^{1993}(x + 1)^{1993} + B(x - 1)^{1993}(x + 1)^{1993} + C(x - 1)^{1993} x^{1993}}{(x^3 - x)^{1993}}.$$

Since $ABC \ne 0$, it is clear that the numerator and denominator have no common
factor. Expanding the numerator yields an expression of the form

$$(A + B + C)x^{3986} + 1993(A - C)x^{3985} + 1993(996A - B + 996C)x^{3984} + \cdots .$$

From A = C = 1, B = −2, we see the degree can be as low as 3984. A lower
degree would imply A + B + C = 0, A − C = 0, 996A − B + 996C = 0, implying
that A = B = C = 0, a contradiction.
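
The degree count can be confirmed with exact integer arithmetic. The Python sketch below is illustrative; the function name is ours. It builds the coefficients of the numerator $A x^{N}(x + 1)^{N} + B(x^2 - 1)^{N} + C x^{N}(x - 1)^{N}$ for N = 1993 from binomial coefficients and reports its degree.

```python
from math import comb


def numerator_degree(A: int, B: int, C: int, N: int = 1993) -> int:
    """Degree of A x^N (x+1)^N + B (x^2-1)^N + C x^N (x-1)^N.
    Assumes the polynomial is not identically zero."""
    coeffs = [0] * (2 * N + 1)
    for j in range(N + 1):
        c = comb(N, j)
        sign = (-1) ** (N - j)
        coeffs[N + j] += A * c         # term of x^N (x+1)^N
        coeffs[2 * j] += B * c * sign  # term of (x^2 - 1)^N
        coeffs[N + j] += C * c * sign  # term of x^N (x-1)^N
    return max(i for i, c in enumerate(coeffs) if c != 0)


print(numerator_degree(1, -2, 1))  # the choice A = C = 1, B = -2 from the solution
print(numerator_degree(1, 1, 1))   # a generic choice keeps the top degree 3986
```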


B-5 (62, 4, 4, 0, 0, 0, 0, 3, 6, 2, 49, 73) 


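Solution 1. The set $\{D_n/n!\}_{n \ge 2}$ forms a sequence which strictly increases to
infinity; it is therefore unbounded.

Observing that $D_2 = 3$ and $D_3 = 11$, we obtain a recursion for $D_{n+1}$. Subtracting
the next-to-last column from the last column and then the next-to-last row
from the last row, one finds

$$D_{n+1} = \det\begin{pmatrix}
3 & 1 & 1 & \cdots & 1 & 0 \\
1 & 4 & 1 & \cdots & 1 & 0 \\
1 & 1 & 5 & \cdots & 1 & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
1 & 1 & 1 & \cdots & n+1 & -n \\
0 & 0 & 0 & \cdots & -n & 2n+1
\end{pmatrix}.$$

Expanding the determinant in its last row, one obtains

$$D_{n+1} = (2n + 1)D_n - n^2 D_{n-1}.$$

Letting $r_n = D_n/n!$, the recursion may be written as

$$r_{n+1} = \frac{2n+1}{n+1}\,r_n - \frac{n}{n+1}\,r_{n-1},$$

or

$$(r_{n+1} - r_n) = \frac{n}{n+1}\,(r_n - r_{n-1}).$$

We conclude that

$$r_{n+1} - r_n = \frac{3}{n+1}\,(r_3 - r_2) = \frac{1}{n+1}.$$

Therefore,

$$r_{n+1} = r_2 + (r_3 - r_2) + (r_4 - r_3) + \cdots + (r_{n+1} - r_n) = \frac{3}{2} + \frac{1}{3} + \cdots + \frac{1}{n+1},$$

so the sequence $(r_n)$ diverges to infinity.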
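The recursion and the resulting harmonic-sum formula can be checked with exact arithmetic. The sketch below assumes, consistently with $D_2 = 3$ and $D_3 = 11$, that $D_n$ is the determinant of the $(n-1) \times (n-1)$ matrix with $3, 4, \dots, n+1$ on the diagonal and 1 elsewhere; the helper names `det` and `D` are illustrative.

```python
from fractions import Fraction
from math import factorial


def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n = len(m)
    result = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result  # a row swap flips the sign
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return result


def D(n):
    """Assumed definition: (n-1) x (n-1) matrix, diagonal 3..n+1, ones elsewhere."""
    size = n - 1
    return det([[i + 3 if i == j else 1 for j in range(size)]
                for i in range(size)])


for n in range(3, 9):
    # Recursion obtained by expanding along the last row.
    assert D(n + 1) == (2 * n + 1) * D(n) - n * n * D(n - 1)
    # r_n = D_n / n! equals the harmonic number 1 + 1/2 + ... + 1/n.
    assert D(n) / factorial(n) == sum(Fraction(1, k) for k in range(1, n + 1))
```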
Solution 2. The problem is the case $a_i = i + 1$ of

$$D_{n+1}(a_1, \dots, a_n) = \det\begin{pmatrix}
1 + a_1 & 1 & 1 & \cdots & 1 \\
1 & 1 + a_2 & 1 & \cdots & 1 \\
1 & 1 & 1 + a_3 & \cdots & 1 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
1 & 1 & 1 & \cdots & 1 + a_n
\end{pmatrix}
= \prod_{i=1}^{n} a_i + \sum_{i=1}^{n} \prod_{\substack{j=1 \\ j \ne i}}^{n} a_j.$$

This formula follows immediately from the recurrence

$$D_{n+1}(a_1, \dots, a_n) = a_n D_n(a_1, \dots, a_{n-1}) + a_{n-1} D_n(a_1, \dots, a_{n-2}, 0).$$

To prove this recurrence, subtract the (n - 1)st column from the nth column, and
then expand along the nth column.

If none of the $a_i$'s equal 0, we can write the polynomial $D_n(a_1, \dots, a_{n-1})$ in the
form

$$D_n(a_1, \dots, a_{n-1}) = a_1 a_2 \cdots a_{n-1}\left(1 + \frac{1}{a_1} + \frac{1}{a_2} + \cdots + \frac{1}{a_{n-1}}\right).$$

It follows that

$$\frac{D_n}{n!} = 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n},$$

so the sequence $(D_n/n!)$ is unbounded.
B-6 (0,0, 0, 0, 0, 0, 0, 0, 5, 4, 39, 155) 


Solution 1. We prove the result more generally for complex matrices (because it is
convenient to use $i = \sqrt{-1}$ in the proof).

The proof is by induction on n.

If n = 1 then the elements of $\mathcal{M}$ commute so that (iv) cannot be satisfied unless
$\mathcal{M} = \{I\}$. Suppose that n > 1 and that the result holds for sets of complex matrices
of smaller dimension.

We may assume $|\mathcal{M}| > 1$, so by (iv), there exist $C, D \in \mathcal{M}$ with $CD = -DC$.
Fix such C, D. As in the first solution, $C^2 = \pm I$. Hence the eigenvalues of C are
$\pm\lambda$ where $\lambda = 1$ or $i$. Furthermore, $\mathbb{C}^n = V_\lambda \oplus V_{-\lambda}$, where $V_\lambda, V_{-\lambda}$ are the
nullspaces of $(C - \lambda I), (C + \lambda I)$ respectively. We observe that if $X \in \mathcal{M}$ then

$$CX = XC \implies (C \mp \lambda I)X = X(C \mp \lambda I) \implies V_{\pm\lambda} X = V_{\pm\lambda};$$
$$CX = -XC \implies (C \mp \lambda I)X = (-1)X(C \pm \lambda I) \implies V_{\pm\lambda} X = V_{\mp\lambda}.$$

In particular, since $V_\lambda D = V_{-\lambda}$, $\dim(V_\lambda) = \dim(V_{-\lambda}) = n/2$.

Let $\mathcal{N} = \{X \in \mathcal{M} \mid CX = XC,\ DX = XD\}$. If $Y \in \mathcal{M}$ then exactly one of
$Y, YC, YD, YCD$ is in $\mathcal{N}$. It follows that $|\mathcal{N}| = |\mathcal{M}|/4$.

For $X \in \mathcal{N}$, let $\phi(X)$ be the $n/2 \times n/2$ matrix representing, with respect to a
fixed basis of $V_\lambda$, the linear transformation given by $v \to vX$ for $v \in V_\lambda$. Then $\phi$ is
injective. To see this: assume $\phi(X) = \phi(Y)$ so that $vX = vY$ for $v \in V_\lambda$; but if
$v \in V_{-\lambda}$ then $vD \in V_\lambda$, so that $vXD = vDX = vDY = vYD$, which again implies
$vX = vY$; since X, Y induce the same transformations of both $V_\lambda$ and $V_{-\lambda}$, it
follows that X = Y.

It suffices finally to show that $\phi(\mathcal{N})$, a set of $n/2 \times n/2$ complex matrices,
satisfies (i), (ii), (iii), (iv), for then, by induction, $|\phi(\mathcal{N})| \le (n/2)^2$, whence
$|\mathcal{M}| = 4|\mathcal{N}| = 4|\phi(\mathcal{N})| \le n^2$.

Conditions (i), (ii), (iii) for $\phi(\mathcal{N})$ are clearly inherited from those of $\mathcal{M}$. To
show (iv), let $\phi(A) \in \phi(\mathcal{N})$, with $\phi(A)$ not the $n/2 \times n/2$ identity matrix. Then
$A \ne I$ (as $\phi$ is injective) and $AB = -BA$ for some $B \in \mathcal{M}$. Let $B'$ be the element
of $\{B, BC, BD, BCD\}$ belonging to $\mathcal{N}$. Since $AB' = -B'A$, $\phi(A)\phi(B') =
-\phi(B')\phi(A)$.


Solution 2. Let G be the group $\{\pm A \mid A \in \mathcal{M}\}$. We must show that $|G| \le 2n^2$.

The center of G, Z(G), consists of $\pm I$, and if $X \in G \setminus Z(G)$, then X has
precisely two conjugates, namely itself and $-X$. Thus G has $1 + |G|/2$ conjugacy
classes, and therefore, G has $1 + |G|/2$ inequivalent irreducible representations
over $\mathbb{C}$.

The number of inequivalent representations of dimension 1 is $|G/G'|$, where
$G'$ is the commutator subgroup. Since $G' = \{\pm I\} = Z(G)$, this number is $|G|/2$.

The remaining irreducible representation then has dimension $\sqrt{|G|/2}$ (since
the sum of the squares of the dimensions of the irreducible representations is $|G|$).
This representation must be contained in the given representation of G in $n \times n$
matrices, for in all the 1-dimensional representations, Z(G) is in the kernel. Hence

$$n \ge \sqrt{|G|/2}, \quad\text{or}\quad 2n^2 \ge |G|.$$


Klosinski:
Department of Mathematics
Santa Clara University
Santa Clara, CA 95053

Alexanderson:
Department of Mathematics
Santa Clara University
Santa Clara, CA 95053

Larson:
Department of Mathematics
St. Olaf College
Northfield, MN 55057


Professor H. B. Fine, of
Princeton University, was fatally
injured by an automobile on the
evening of Friday, December 21
and died about one a.m. on December
22, 1928. He was seventy
years of age.
A Visual Explanation
of Jensen’s Inequality


Tristan Needham 


“This theorem is so fundamental that we propose to give a number of proofs, of 
varying degrees of simplicity and generality.” So say Hardy, Littlewood, and Polya 
({1], p. 17) of the theorem of the arithmetic and geometric means. True to their 
word, they proceed to give eleven (!) different proofs of the fact that for
non-negative $x_i$,

$$\sqrt[n]{x_1 x_2 \cdots x_n} \le \frac{x_1 + x_2 + \cdots + x_n}{n} \tag{1}$$

with equality iff $x_1 = x_2 = \cdots = x_n$. For elegant applications (suitable for the
classroom) of this result to elementary geometry, see [2].

One of the simplest proofs of (1) consists in recognizing it to be merely a special 
case of Jensen’s inequality [3]. This widely used result (e.g., probability theory [4]) 
states that if the graph of a real continuous function f(x) is concave down then 


$$\frac{\sum f(x_i)}{n} \le f\left(\frac{\sum x_i}{n}\right) \tag{2}$$


with equality iff the x’s are all equal. If the graph is concave up, the inequality is 
reversed. To obtain (1) we need only put f(x) = In x and note that its graph is 
concave down. Very neat, but where did (2) come from? This note describes a 
particularly simple way of seeing its truth, which we hope may be of value in the 
classroom. Indeed, we believe it could even be used successfully in high schools. 

We have given no formal definition of a graph being “concave down,” and when 
presenting the following argument to young students we shall suppose that none 
will be given; what matters is that they know what one looks like. With more 
mature students we may define the graph of f to be “concave down” if the region 
{(x, y): y < f(x)} below the graph is convex. This is not one of the standard 
definitions, but it is a visually compelling inference from any other reasonable 
definition. 

Consider a set of n point particles in the plane, of equal mass and with position
vectors $r_i$. The center of mass therefore has position vector

$$c = \frac{1}{n} \sum r_i,$$

from which it follows easily that

$$\sum (r_i - c) = 0.$$

In other words (see Figure 1), the vectors from c to the particles cancel.

Figure 1


Imagining pegs sticking out of the plane at the locations of the particles, stretch
a rubber band so as to enclose all the pegs. When released, the rubber band will
contract into the dashed polygon H of Figure 2. This is the "convex hull" of the
set of particles. The key point is this: c must lie in the shaded interior of H. For if p
is outside this set, we see that the vectors from p to the particles cannot possibly
cancel, as they must do for c. More formally, we take it as visually evident that
through any exterior point p we may draw a line L such that H and its shaded
interior lie entirely on one side of L. [Alternatively, this property may be taken as
a (non-standard) definition of convexity for a closed planar set.] The impossibility
of the vectors cancelling now follows from their lying entirely on this side of L, for
they all must have positive components in the direction of the normal vector n.
Except when the particles are collinear (in which case H collapses to a line-segment),
the same reasoning forbids c from lying on H.


Figure 2 


Next, suppose that the particles are distributed along a convex curve K. See
Figure 3. The shaded interior of H now lies entirely on the concave side of K,
and consequently so too must c. Furthermore, we see that c can only lie on K in
the degenerate case that all the particles coalesce. Finally, take K to be the graph
of a function f(x). If this graph is concave down [up], then c lies below [above] K.
Thus, with the particles located at $(x_i, f[x_i])$, we conclude that if the graph is
concave down,

$$\frac{\sum f(x_i)}{n} = \text{height of } c \le \text{height of } q = f\left(\frac{\sum x_i}{n}\right),$$

with equality iff the x’s are all equal. If the graph is concave up, the inequality is
simply reversed.

Figure 3

As a bonus, observe that c must also lie on or above the dashed chord 
connecting the two end particles. Thus if y = g(x) is the equation of this chord, 
we obtain 


$$g\left(\frac{\sum x_i}{n}\right) = \text{height of } r \le \text{height of } c = \frac{\sum f(x_i)}{n}.$$


I do not know if this result has a name. 
We note that the above ideas can be generalized in at least two directions: 


(1) The positive masses $m_i$ of the particles need not be equal for the argument to
work. Thus, once again taking the graph to be concave down,

$$\frac{\sum m_i f(x_i)}{M} \le f\left(\frac{\sum m_i x_i}{M}\right),$$

where M denotes the total mass. This is essentially the form that is used in
probability theory, for we are free to interpret $(m_i/M)$ as a probability distribution
for $x_i$, yielding

$$E[f(x)] \le f(E[x]),$$

where E stands for the expected value. Also, by allowing the number of particles
to increase without limit, we may pass from a discrete probability distribution to a
continuous one.


(2) The argument is equally applicable to a set of particles in three-dimensional
space. Thus, taking these particles (of equal mass, say) to be distributed over a
surface z = f(x, y) that is concave down, we deduce that

$$\frac{\sum f(x_i, y_i)}{n} \le f\left(\frac{\sum x_i}{n}, \frac{\sum y_i}{n}\right).$$

Of course this too may be generalized to unequal masses and be given a probabilistic
interpretation.
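As a numerical illustration of the center-of-mass picture, the following sketch places particles on the graph of $\ln x$ (which is concave down) and compares the height of the center of mass with the height of the graph above it. The function names and the sample points are chosen here for illustration only.

```python
import math


def center_of_mass_height(f, xs, masses=None):
    """Height of the center of mass of particles at (x_i, f(x_i))."""
    masses = masses or [1.0] * len(xs)
    total = sum(masses)
    return sum(m * f(x) for m, x in zip(masses, xs)) / total


def graph_height_above_center(f, xs, masses=None):
    """Height of the graph of f at the x-coordinate of the center of mass."""
    masses = masses or [1.0] * len(xs)
    total = sum(masses)
    return f(sum(m * x for m, x in zip(masses, xs)) / total)


xs = [1.0, 2.0, 4.0, 8.0]

# Equal masses: Jensen's inequality for ln gives AM-GM after exponentiating.
lhs = center_of_mass_height(math.log, xs)
rhs = graph_height_above_center(math.log, xs)
geometric_mean = math.exp(lhs)
arithmetic_mean = sum(xs) / len(xs)

# Unequal masses: the weighted form of the inequality.
masses = [3.0, 1.0, 0.5, 2.0]
weighted_lhs = center_of_mass_height(math.log, xs, masses)
weighted_rhs = graph_height_above_center(math.log, xs, masses)
```

In both cases the center of mass lies strictly below the graph, and the two heights coincide only when all the $x_i$ are equal.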

I do not wish to claim that the above is more original than it really is. In
particular, the argument associated with Figure 2 is very old; I merely rediscovered
it. The first important application of this idea that I know of occurred in 1874
when F. Lucas used it (see [5]) to demonstrate a complex analogue of Rolle’s
theorem: the critical points of a polynomial in the complex plane must all lie within
the convex hull of its zeros. This follows from Figure 2 by observing [Gauss, 1816]
that if P(z) is the factorized polynomial, the conjugate of the logarithmic derivative
[P’(z)/P(z)] is a weighted sum of vectors from z to the zeros.

Also, consideration of centers of mass is certainly not new in the context of 
Jensen’s inequality, and thus it is hard to believe that so simple a line of thought 
can have escaped notice. Nevertheless, it would appear that in the literature (e.g., 
[1], p. 71) the location of the center of mass is merely used as an interpretation of 
(2), rather than as the source of an explanation. 
The Index of a Constrained
Critical Point


Catherine Hassell and Elmer Rees 


1. INTRODUCTION. This note deals with the problem of determining the type of 
a critical point arising in the method of Lagrange multipliers. This method is the 
usual one used to solve the following problem: 

To find the critical points of a smooth function f defined on $M^n \subset \mathbb{R}^{n+m}$, a
smooth submanifold given as the common zero-set of m smooth functions
$g_i: \mathbb{R}^{n+m} \to \mathbb{R}$.

The method consists of introducing a vector $\lambda = (\lambda_1, \lambda_2, \dots, \lambda_m)$ of ‘undetermined
multipliers’, defining L to be $f + \lambda \cdot g = f + \sum_{i=1}^{m} \lambda_i g_i$, and finding its
critical points. The question of deciding the non-degeneracy and type of a critical
point is usually disregarded in the text books or else dismissed as being too
complicated. Our purpose is to show, on the contrary, that criteria can be stated
and derived in a straightforward manner.

We compare the Hessian of f restricted to M with the bordered Hessian, that
is, the Hessian of L regarded as a function of n + 2m variables (including $\lambda$). The
two Hessians have the same nullity at corresponding critical points and when they
are non-degenerate, they have the same signature.


2. LAGRANGE MULTIPLIERS AND THE BORDERED HESSIAN. Let
$U \subset \mathbb{R}^{n+m}$ be an open subset and $g: U \to \mathbb{R}^m$ be a $C^1$-function such that
$Dg(a): \mathbb{R}^{n+m} \to \mathbb{R}^m$ has rank m for every $a \in M = \{x \in U \mid g(x) = c\}$. Hence, by
the implicit function theorem [F, p. 117], M is a smooth n-dimensional manifold.
We wish to determine the critical points of the function $f_M: M \to \mathbb{R}$ which is the
restriction of a $C^2$-function $f: U \to \mathbb{R}$.

For $\lambda \in \mathbb{R}^m$, we consider the Lagrangian

$$L = f + \lambda \cdot (g - c)$$

either as a function of $x \in U$ or as a function of $(x, \lambda) \in U \times \mathbb{R}^m$. The critical
points are obtained by solving the equations

$$\nabla L = 0 \quad\text{and}\quad g = c$$

or, equivalently

$$\nabla L = 0$$

regarding L as a function of $(x, \lambda)$.

To determine the nature of a critical point a of $f_M$, one could study the Taylor
series of $f_M$ at a in terms of local coordinates on M. Let $H_M f(a)$ be the Hessian
form of $f_M$; it is the symmetric bilinear form on the tangent space $T_a(M)$ which
represents the quadratic terms in the Taylor expansion of $f_M$. If $x_1, \dots, x_n$ are
local coordinates on M near a, the entries of the matrix of $H_M f(a)$ are $(\partial^2 f/\partial x_i\, \partial x_j)$
evaluated at a. If this matrix is non-singular, the form $H_M f(a)$ is called non-degenerate
and $f_M$ is called a Morse function at a. In this case the nature of the
critical point is determined by the algebraic properties of $H_M f(a)$. The index of
the critical point, that is the number of independent directions in which $f_M$
decreases, is determined by the signature of the form $H_M f(a)$. We will give
practical methods for determining when $H_M f(a)$ is non-degenerate and for calculating
its signature.

Let g be a $C^2$-function and let $HL(a, \lambda)$ be the bordered Hessian of L at the
critical point $(a, \lambda)$ of L; that is, the Hessian of L regarded as a bilinear form on
$T_{(a,\lambda)}(U \times \mathbb{R}^m) = \mathbb{R}^{n+2m}$. If

$$g = (g_1, g_2, \dots, g_m)$$

let $Dg^T = (\nabla g_1, \nabla g_2, \dots, \nabla g_m)$ denote the (transposed) Jacobian matrix of g at a.
Then the matrix of the bordered Hessian $HL(a, \lambda)$ is the (n + 2m) by (n + 2m)
symmetric matrix

$$\begin{pmatrix} Hf + \lambda \cdot Hg & Dg^T \\ Dg & 0 \end{pmatrix}$$

where $\lambda$ is evaluated from the equation

$$0 = \nabla L = \nabla f + \lambda \cdot Dg$$

at a.

3. THE MAIN RESULT. If a symmetric bilinear form on a real vector space is
represented by the matrix H, then its nullity is the dimension of the kernel of H
and its signature is p - q, where p and q denote the number of positive and
negative eigenvalues of H respectively.


Theorem 1. The nullity of $H_M f(a)$ equals the nullity of $HL(a, \lambda)$.
If $HL(a, \lambda)$ (and hence $H_M f(a)$) is non-degenerate, then the signature of $H_M f(a)$
equals that of $HL(a, \lambda)$.


This theorem follows from the purely algebraic Theorem 2, using Taylor’s
formula and the implicit function theorem [F]. When it is applied to critical points
as above, it yields the following result.


Corollary. The point $a \in M$ is a critical point of $f_M: M \to \mathbb{R}$ if and only if $(a, \lambda)$ is a
critical point of $L: U \times \mathbb{R}^m \to \mathbb{R}$. In this case, a is non-degenerate if and only if
$(a, \lambda)$ is non-degenerate and the index $I(f_M, a)$ of $f_M$ at a is related to the index
$I(L, a, \lambda)$ of L at $(a, \lambda)$ by

$$I(f_M, a) + m = I(L, a, \lambda).$$

So, for example, a is a local minimum of $f_M$ if $(a, \lambda)$ is a non-degenerate critical
point of L of index m.
Similarly, a is a local maximum of $f_M$ if $(a, \lambda)$ is a non-degenerate critical point
of L of index n + m.


Theorem 2. Let $C = \begin{pmatrix} A & B^T \\ B & 0 \end{pmatrix}$ be the symmetric real matrix consisting of the
$(n + m) \times (n + m)$ symmetric matrix A, the $m \times (n + m)$ matrix B of rank m and the zero
$m \times m$ matrix 0. The symmetric bilinear form induced on Ker B by A is denoted b.
Then the bilinear form on $\mathbb{R}^{n+2m}$ defined by C is isomorphic to $b \oplus H$ where H is the
2m-dimensional hyperbolic form $\begin{pmatrix} 0 & I \\ I & 0 \end{pmatrix}$.


The proof that we give for this theorem is considerably simpler than our original 
one and is based on a proof provided for us by Dr. A. A. Ranicki. 
First, we give proofs of some facts from linear algebra that we need. 


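Fact 1 (Fredholm alternative) [HK, p. 103]. Let $P^T: \mathbb{R}^k \to \mathbb{R}^l$ be the transpose of
the matrix $P: \mathbb{R}^l \to \mathbb{R}^k$, then

$$(\operatorname{Ker} P)^\perp = \operatorname{Im} P^T.$$

Proof: Let r denote rank P. Then $\dim \operatorname{Ker} P = l - r$ hence $\dim (\operatorname{Ker} P)^\perp = r$.
Also $\dim \operatorname{Im} P^T = r$. It is therefore enough to show that $\operatorname{Im} P^T \subset (\operatorname{Ker} P)^\perp$, i.e.
$\operatorname{Im} P^T \perp \operatorname{Ker} P$.

Suppose $x \in \mathbb{R}^k$ and $z \in \operatorname{Ker} P$ then

$$z \cdot P^T x = z^T P^T x = 0 \quad\text{since } Pz = 0.$$

Let b be a bilinear form on the real finite dimensional space V, the annihilator
of a subspace $U \subset V$ is $U^\perp = \{x \in V \mid b(x, u) = 0\ \ \forall u \in U\}$.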


Fact 2. Let b be a non-degenerate symmetric bilinear form on V of dimension 2m
and let $W \subset V$ have dimension m and $W \subset W^\perp$. Then b is represented by the
matrix $\begin{pmatrix} 0 & I \\ I & 0 \end{pmatrix}$.

Proof: Choose $w \ne 0$ in W and v such that b(w, v) = 1. Let $u = v - b(v, v)w/2$,
then b(u, u) = 0 and {u, w} is the required basis in the case m = 1. When m > 1,
let U = Span{u, w} and consider $U^\perp$; this contains a subspace that is self-annihilating
and of dimension m - 1. The result follows by induction.

We also make use of the principal axis theorem. 


Fact 3 [HK, p. 266]. If b is a symmetric bilinear form on an n-dimensional real
inner product space then there is an orthonormal basis $\{e_1, e_2, \dots, e_n\}$ such that
$b(e_i, e_j) = \delta_{ij} a_i$ for some $a_i \in \mathbb{R}$.


Proof of Theorem 2: If W denotes the m-dimensional subspace

$$\left\{ \begin{pmatrix} 0 \\ y \end{pmatrix} : y \in \mathbb{R}^m \right\} \subset \mathbb{R}^{n+2m}$$

then by Fact 1 applied to B, there is a canonical orthogonal decomposition

$$\mathbb{R}^{n+2m} = \operatorname{Ker} B \oplus \operatorname{Im} B^T \oplus W.$$

By Fact 3, one can choose an orthonormal basis

$$\{e_1, \dots, e_n\} \text{ for } \operatorname{Ker} B$$

such that $b(e_i, e_i) = a_i$ and $b(e_i, e_j) = 0$ for $i \ne j$. Then for $1 \le i \le n$, $e_i$ is a
vector whose component in W is zero. Since $e_i \in \operatorname{Ker} B$ one has $Ce_i \in W^\perp$ and
hence $Ce_i = k_i + B^T f_i$ where $k_i \in \operatorname{Ker} B$ and $f_i \in \mathbb{R}^m = W$ is unique because $B^T$
is one-to-one. Moreover, $k_i = a_i e_i$ because $e_j^T C e_i = \delta_{ij} a_i$. Hence,

$$Ce_i = a_i e_i + B^T f_i = a_i e_i + C f_i.$$

Define $K = \operatorname{Span}\{e_i - f_i : 1 \le i \le n\}$.
The following steps will prove Theorem 2. 


Step 1. K and $\operatorname{Im} B^T \oplus W$ are orthogonal with respect to C.
Step 2. The form defined by C on K is isomorphic to b.
Step 3. The form defined by C on $\operatorname{Im} B^T \oplus W$ is hyperbolic.


Proof of Step 1: Since $C(e_i - f_i) = a_i e_i \in \operatorname{Ker} B$, one has that $C(K) \subset \operatorname{Ker} B$ and
Ker B is orthogonal to $\operatorname{Im} B^T \oplus W$.


Proof of Step 2: Since $(e_i - f_i)^T C (e_j - f_j) = a_i \delta_{ij}$, one has that $C|K$ is isomorphic
to b.


Proof of Step 3: By choosing a basis for the image of $B^T$, one can take the matrix
of C to have the following form

$$\begin{pmatrix} A_1 & A_2^T & 0 \\ A_2 & A_3 & I_m \\ 0 & I_m & 0 \end{pmatrix}.$$

The form on $\operatorname{Im} B^T \oplus W$ is therefore represented by

$$\begin{pmatrix} A_3 & I_m \\ I_m & 0 \end{pmatrix}$$

and so is equivalent to a hyperbolic form by Fact 2, since W is a self-annihilating
subspace of dimension m.


4. COMPARISON WITH CLASSICAL CRITERIA. In the literature there are 
criteria for deciding when a critical point is a local maximum or minimum, for 
example [H] or [G]. Here we show how these criteria are related to our result. 


Criterion 1. Let $C = \begin{pmatrix} A & B^T \\ B & 0 \end{pmatrix}$ be as in Theorem 2 and assume that the last $m \times m$
submatrix of B is non-singular, then the form induced by A on Ker B is positive
definite if the determinants $\Delta_i$ for $0 \le i \le n$ have sign $(-1)^m$ where $\Delta_i = \det C_i$
and $C_i$ is obtained from C by deleting its first i rows and columns.


Proof: Write $C_n = \begin{pmatrix} A_n & B_n^T \\ B_n & 0 \end{pmatrix}$. Then $\Delta_n = \det C_n = (-1)^m (\det B_n)^2$, so sign $\Delta_n =
(-1)^m$ since $B_n$ is non-singular. By Fact 2, $C_n$ is hyperbolic and so has index m.

The proof is completed by using induction based on the following:


Lemma. Let H be a non-singular symmetric real matrix and $H_1$ be obtained from H
by deleting one row and the corresponding column. If $H_1$ is also non-singular and
index H is the number of negative eigenvalues of H then

$$\operatorname{index} H = \begin{cases} \operatorname{index} H_1 \\ \operatorname{index} H_1 + 1 \end{cases}$$

depending on whether det H and det $H_1$ have the same or the opposite sign.


Proof: Let M and $M_1$ be maximal negative definite subspaces for H and $H_1$
respectively.
Recall that the dimension of a maximal negative definite subspace is unique.
Clearly, $\dim M_1 \le \dim M \le \dim M_1 + 1$.
Also sign det $H = (-1)^{\dim M}$ and sign det $H_1 = (-1)^{\dim M_1}$.
Hence det H and det $H_1$ have the same sign $\iff \dim M = \dim M_1$ as required.


Another criterion discovered in the 19th century is the following (see [H] for the 
historical references). 


Criterion 2. Let $\begin{pmatrix} A & B^T \\ B & 0 \end{pmatrix}$ be as in Theorem 2. Then the form induced by A on Ker B
is positive definite if and only if the roots of

$$\det \begin{pmatrix} A - tI & B^T \\ B & 0 \end{pmatrix} = 0$$

are all positive.

Note that the above equation is of degree n.

The stronger result that the roots of the above equation are the eigenvalues of
the form A restricted to Ker B with the same multiplicities is an immediate
consequence of Theorem 2 applied to the matrix

$$\begin{pmatrix} A - tI & B^T \\ B & 0 \end{pmatrix}$$

when t is a root of the above equation.


5. EXAMPLES 


1. To find the critical points of $f(x, y, z) = x^3 + y^3 + z^3$ on the surface
$x^{-1} + y^{-1} + z^{-1} = 1$. (This example is taken from [G, p. 94].)
Let $L = x^3 + y^3 + z^3 + \lambda(x^{-1} + y^{-1} + z^{-1} - 1)$, then $\partial L/\partial x = 3x^2 - \lambda x^{-2}$
etc. and the bordered Hessian is

$$\begin{pmatrix}
6x + 2\lambda x^{-3} & 0 & 0 & -x^{-2} \\
0 & 6y + 2\lambda y^{-3} & 0 & -y^{-2} \\
0 & 0 & 6z + 2\lambda z^{-3} & -z^{-2} \\
-x^{-2} & -y^{-2} & -z^{-2} & 0
\end{pmatrix}.$$

The critical points are given by

$$x^4 = y^4 = z^4 = \lambda/3 \quad\text{and}\quad x^{-1} + y^{-1} + z^{-1} = 1.$$

These are x = y = z = 3, $\lambda = 243$; x = y = 1, z = -1, $\lambda = 3$ and two other
solutions symmetrical with the latter.

In the first case the Hessian is

$$\begin{pmatrix}
36 & 0 & 0 & -9^{-1} \\
0 & 36 & 0 & -9^{-1} \\
0 & 0 & 36 & -9^{-1} \\
-9^{-1} & -9^{-1} & -9^{-1} & 0
\end{pmatrix}$$

which is non-degenerate and has signature 2, so the critical point has index 0; that
is, it is a non-degenerate minimum.
In the second case the Hessian is

$$\begin{pmatrix}
12 & 0 & 0 & -1 \\
0 & 12 & 0 & -1 \\
0 & 0 & -12 & -1 \\
-1 & -1 & -1 & 0
\end{pmatrix}$$

which is non-degenerate and has signature 0. So this critical point (and the other
two symmetrical with it) has index 1; that is, it is a saddle point.
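The two signatures in this example can be confirmed with exact arithmetic. The sketch below is not the authors' method: it computes the characteristic polynomial by the Faddeev-LeVerrier recursion and counts eigenvalue signs with Descartes' rule of signs, which is exact here because a real symmetric matrix has only real eigenvalues. All function names are illustrative.

```python
from fractions import Fraction


def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def char_poly(A):
    """Coefficients of det(tI - A), highest power first (Faddeev-LeVerrier)."""
    n = len(A)
    A = [[Fraction(x) for x in row] for row in A]
    M = [[Fraction(0)] * n for _ in range(n)]
    coeffs = [Fraction(1)]
    for k in range(1, n + 1):
        AM = matmul(A, M)
        M = [[AM[i][j] + (coeffs[-1] if i == j else 0) for j in range(n)]
             for i in range(n)]
        trace = sum(matmul(A, M)[i][i] for i in range(n))
        coeffs.append(-trace / k)
    return coeffs


def sign_changes(seq):
    signs = [1 if c > 0 else -1 for c in seq if c != 0]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)


def inertia(A):
    """(number of positive, number of negative) eigenvalues of symmetric A."""
    p = char_poly(A)
    n = len(A)
    positive = sign_changes(p)
    negative = sign_changes([c * (-1) ** (n - i) for i, c in enumerate(p)])
    return positive, negative


def bordered_hessian(x, y, z, lam):
    """Bordered Hessian of L = x^3 + y^3 + z^3 + lam*(1/x + 1/y + 1/z - 1)."""
    d = lambda t: 6 * t + 2 * lam / Fraction(t) ** 3
    b = lambda t: -1 / Fraction(t) ** 2
    return [[d(x), 0, 0, b(x)],
            [0, d(y), 0, b(y)],
            [0, 0, d(z), b(z)],
            [b(x), b(y), b(z), 0]]


m = 1  # number of constraints
for point in [(3, 3, 3, 243), (1, 1, -1, 3)]:
    pos, neg = inertia(bordered_hessian(*point))
    # Corollary: index of the constrained critical point = index of L minus m.
    print(point, "signature", pos - neg, "constrained index", neg - m)
```

The first point gives signature 2 and constrained index 0 (a minimum); the second gives signature 0 and constrained index 1 (a saddle), in agreement with the text.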


2. Consider the quadratic form $x^T A x$ on the sphere $x^T x = 1$ in $\mathbb{R}^n$. Critical points
x are given by

$$Ax - \lambda x = 0$$

i.e. by an eigenvector x with eigenvalue $\lambda$. The critical point is non-degenerate if

$$\begin{pmatrix} A - \lambda I & x \\ x^T & 0 \end{pmatrix}$$

is non-singular and this is true if the eigenvalue has multiplicity 1. If the
eigenvalue has multiplicity r > 1, let $x_1, \dots, x_r$ be a basis for the eigenspace, then

$$\begin{pmatrix}
A - \lambda I & x_1 & \cdots & x_r \\
x_1^T & 0 & \cdots & 0 \\
\vdots & \vdots & & \vdots \\
x_r^T & 0 & \cdots & 0
\end{pmatrix}$$

is non-singular. The corresponding critical submanifold is a great sphere of
dimension r - 1 and is non-degenerate in Bott’s sense [B]. We recall this concept
briefly. Let S be a connected critical submanifold for a function f; it is called
non-degenerate if for every $x \in S$, the Hessian is non-degenerate normal to S, i.e.
the Hessian is zero on $T_x S$ and the induced form on $T_x M / T_x S$ is non-degenerate.
If $M^n \subset \mathbb{R}^{n+m}$ is defined by g = c, then the bordered Hessian is

$$HL(a, \lambda) = \begin{pmatrix} Hf + \lambda \cdot Hg & Dg^T \\ Dg & 0 \end{pmatrix}.$$

If $\{e_1, \dots, e_r\}$ is a basis for $T_a S$, $e_i \in \mathbb{R}^{n+m}$, then S is non-degenerate at a if

$$\begin{pmatrix}
Hf + \lambda \cdot Hg & Dg^T & E^T \\
Dg & 0 & 0 \\
E & 0 & 0
\end{pmatrix}$$

is non-singular, where $E^T$ is the matrix $(e_1, e_2, \dots, e_r)$. The signature of this
enlarged bordered Hessian determines the index of the critical submanifold in the
same way as our main result deals with non-degenerate critical points.
A Characterization of Euclidean Spaces


In connection with the article “A Characterization of Inner Product
Spaces” by Neil Falkner (this Monthly 100 (1993), 246-249) it might
be worth noting that inner product spaces over the reals are
characterized by the validity of the Converse Theorem of Pythagoras.
The latter, namely that the smaller sides of a triangle which fulfills
the famous Pythagorean relation $a^2 + b^2 = c^2$ are orthogonal, is
often assumed without proof as for instance in the argument about
the legendary rope-stretchers of Ancient Egypt, who are said to have
used a triangle with sides 3, 4, 5 to construct a right angle.

In the notation of an inner product space we have
$\|x + y\|^2 = \|x\|^2 + 2\operatorname{Re}(x, y) + \|y\|^2$. So the Theorem of Pythagoras and its converse
are obvious in the case of a real inner product space. However, in
any complex inner product space (with the exception of the trivial
space {0}, which is no real complex space anyway) we may take $x \ne 0$
and $y = ix$ such that $(x, y) = i\|x\|^2 \ne 0$, but still $\|x + y\|^2 = \|x\|^2
+ \|y\|^2$ holds true.

The fact that in Euclid’s Elements the Theorem of Pythagoras
(I.47) is followed by the Converse Theorem of Pythagoras (I.48) and
its proof is another justification for calling inner product spaces over
the reals Euclidean spaces.
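The complex counterexample is easy to reproduce numerically. The sketch below works in $\mathbb{C}^2$ and assumes an inner product that is conjugate-linear in its first argument, a convention chosen here so that $(x, ix) = i\|x\|^2$ as in the note; the vector x is an arbitrary illustrative choice.

```python
def inner(x, y):
    """Inner product on C^n, conjugate-linear in the first argument (assumed convention)."""
    return sum(a.conjugate() * b for a, b in zip(x, y))


def norm_sq(x):
    return inner(x, x).real


x = [1 + 2j, 3 - 1j]
y = [1j * a for a in x]                # y = i x
s = [a + b for a, b in zip(x, y)]      # x + y

# The Pythagorean relation holds although (x, y) is not zero.
print(inner(x, y), norm_sq(s), norm_sq(x) + norm_sq(y))
```

Here $(x, y)$ is purely imaginary, so its real part vanishes and the Pythagorean relation holds even though x and y are not orthogonal.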
On Some Irrational Decimal Fractions


Norbert Hegyvari 


It is known that the decimal fraction

$$\alpha = 0.235711131719\ldots$$

is irrational, where the sequence of digits is formed by the primes in ascending
order. In [1, Th. 138] there are two different proofs for this statement. The first
uses a special case of Dirichlet’s theorem, namely: any arithmetical progression
of the form $10^{s+1}k + 1$ (k = 1, 2, ...) contains primes. In the second proof it is
assumed that there is a prime between N and 10N for every N > 0, which is a
special case of Bertrand’s Postulate. Similar proofs are found in [2].

In this article we will give a direct proof for this statement. We prove even 
more. 


Theorem. Let $1 \le a_1 < a_2 < \cdots$ be a sequence of integers for which $\sum_{i=1}^{\infty} 1/a_i = \infty$.
Then the decimal fraction $\alpha = 0.(a_1)(a_2)\ldots(a_n)\ldots$ is irrational.


Since $\sum_{i=1}^{\infty} 1/p_i = \infty$, where $p_1 < p_2 < \cdots$ is the sequence of primes, we immediately
get the original version of the statement.


Definition. Let B be a block of digits $b_1 b_2 \ldots b_s$ with $s \ge 1$ and $0 \le b_i \le 9$ for
i = 1, 2, ..., s. Let n be a positive integer $n = \sum_{k=1}^{m} c_k 10^{m-k}$ with $c_1 \ne 0$. The integer n
is said to contain the block of digits B if for some $j \ge 0$ we have $c_{j+i} = b_i$ for
every i = 1, 2, ..., s. For example, the integer 1402857 contains the blocks 14 and
0285 (among others), but not the blocks 014 or 582.
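The definition amounts to a substring test on the decimal expansion. A minimal sketch follows; the name `contains_block` and the choice to pass the block as a digit string (so that leading zeros such as in 0285 are preserved) are conventions adopted here.

```python
def contains_block(n, block):
    """True if the decimal expansion of the positive integer n contains
    the digit string `block` as a run of consecutive digits."""
    return block in str(n)


# The example from the definition.
for block in ["14", "0285", "014", "582"]:
    print(block, contains_block(1402857, block))
```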


Lemma. If $X = X(b_1, b_2, \dots, b_s)$ denotes the sequence of positive integers not
containing the block of digits $b_1 b_2 \ldots b_s$, then $\sum_{n \in X} 1/n$ is convergent.


We mention that the Lemma is a generalization of a well-known exercise (see
[1, Th. 144]).


Proof of the Lemma: Let $s_n = 1/x_1 + 1/x_2 + \cdots + 1/x_n$ and let t be an integer for
which $x_{t-1} < 10^s \le x_t$. Then we have

$$s_n < 1/x_1 + 1/x_2 + \cdots + 1/x_t + 10^{-s}\left(1/[x_{t+1}/10^s] + \cdots + 1/[x_n/10^s]\right).$$

We note that if $t < i \le n$, then $[x_i/10^s]$ is a member of X, say $x_j$. Also, since the
block $b_1 b_2 \ldots b_s$ appears in at least one of $10^s$ consecutive integers, it follows that
for any fixed $x_j$ there are at most $10^s - 1$ values of $x_i$ such that $[x_i/10^s] = x_j$, and
we have

$$s_n < \sum_{i=1}^{t} 1/x_i + (10^s - 1)10^{-s} s_n \quad\text{or}\quad s_n < 10^s \sum_{i=1}^{t} 1/x_i,$$

which proves the lemma.
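The bound $s_n < 10^s \sum_{i \le t} 1/x_i$ can be observed numerically for a small block. The sketch below uses the single digit 9 as an illustrative block (so s = 1); the function names and the cutoff $10^5$ are choices made here, not part of the proof.

```python
def partial_sum_avoiding(block, limit):
    """Sum of 1/n over 1 <= n <= limit whose decimal expansion avoids `block`."""
    return sum(1.0 / n for n in range(1, limit + 1) if block not in str(n))


def lemma_bound(block):
    """10^s * (1/x_1 + ... + 1/x_t), where x_t is the first member of X
    that is at least 10^s and s is the length of the block."""
    s = len(block)
    total, n = 0.0, 1
    while True:
        if block not in str(n):
            total += 1.0 / n
            if n >= 10 ** s:
                return 10 ** s * total
        n += 1


bound = lemma_bound("9")
partial = partial_sum_avoiding("9", 10 ** 5)
print(partial, bound)
```

For the block 9 the members $x_1, \dots, x_t$ are 1, ..., 8 and 10, and the partial sums stay below the bound however far the cutoff is pushed.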
Proof of the Theorem: Assume that $\alpha$ is a rational number. Thus $\alpha$ is a periodic
decimal, with a block of digits, say $b_1 b_2 \ldots b_s$, repeating endlessly perhaps after an
initial first block. If B is a block of 1’s, define $c_1 c_2 \ldots c_{2s}$ to be a block of 2’s of
length 2s; otherwise define $c_1 c_2 \ldots c_{2s}$ to be a block of 1’s of length 2s. Now
define $Y = Y(c_1, c_2, \dots, c_{2s})$ as the sequence of natural numbers not containing
the block of digits $c_1 c_2 \ldots c_{2s}$. If we write

$$\sum_{i=1}^{\infty} 1/a_i = \sum_{a_i \in Y} 1/a_i + \sum_{a_i \notin Y} 1/a_i$$

then by the Lemma the first sum on the right side converges, and hence the second
sum diverges. This implies that there are infinitely many $a_i$ that contain the block
of digits $c_1 c_2 \ldots c_{2s}$. This in turn implies that B cannot be a repeating block of
digits in $\alpha$. This contradiction establishes the Theorem.


ACKNOWLEDGMENT. The author would like to thank the referee for a number of suggestions and 
for detecting some flaws in our original version. 
Professor Florian Cajori died suddenly
of pneumonia on August 14,
1930, at his home in Berkeley, California.
He was a charter member of
the Mathematical Association of
America and was one of an original
group of four (later enlarged to
of the American Mathematical
Monthly on a sound financial basis.
researches will be published in the
Monthly in due course.
NOTES


Edited by: John Duncan


The Symmetry Principle for Möbius
Transformations


Louis Brickman 


With precise definitions to come below, the symmetry principle is the following. 


Theorem. Let E be a circle or extended line. Let T be a Möbius transformation. Let
z and z* be symmetric points with respect to E. Then T(z) and T(z*) are symmetric
with respect to T(E) (which is also a circle or extended line).


Can the discussion be both rigorous and intuitively satisfying? The key is that
the theorem is more about “conjugate Möbius transformations” than ordinary
Möbius transformations. Indeed, the theorem holds for either type of transformation,
whereas the symmetry concept involves only the former. To prepare the proof
we need only set down the composition relationships between the two types
(Lemma 1), and then show that each circle or extended line determines a unique
and very special conjugate Möbius transformation (Lemma 2).

Many of the standard proofs establish the conclusion separately for special 
transformations such as translations, inversions, and dilations. The results are then 
combined in a composition argument. Another well known approach depends 
upon the concept of cross ratio. The method here seems simplest. 


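Preliminary Definitions. The complex plane and extended complex plane are
denoted by $\mathbb{C}$ and $\hat{\mathbb{C}}$, respectively ($\hat{\mathbb{C}} = \mathbb{C} \cup \{\infty\}$). A Möbius transformation is a
map $T: \hat{\mathbb{C}} \to \hat{\mathbb{C}}$ defined by

$$T(z) = \frac{az + b}{cz + d} \qquad (a, b, c, d \in \mathbb{C};\ ad - bc \ne 0).$$

The formula is extended by continuity for $z = \infty$ and, if $c \ne 0$, for $z = -d/c$.

With each such T we associate the conjugate Möbius transformation $\overline{T}: \hat{\mathbb{C}} \to \hat{\mathbb{C}}$
defined by

$$\overline{T}(z) = \overline{T(z)} \qquad (\overline{\infty} = \infty).$$

Finally, we let $\mathcal{M}$ be the set (actually “group”) of all Möbius transformations and
$\overline{\mathcal{M}}$ be the set of all conjugate Möbius transformations.

We remark (without proof) that our first lemma is equivalent to the statement
that $\mathcal{M} \cup \overline{\mathcal{M}}$ is a group, and $\mathcal{M}$ and $\overline{\mathcal{M}}$ are the cosets of the normal subgroup $\mathcal{M}$.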


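Lemma 1. Let $S, T \in \mathcal{M}$. Then

(1) $T \circ S \in \mathcal{M}$, (2) $T \circ \overline{S} \in \overline{\mathcal{M}}$, (3) $\overline{T} \circ S \in \overline{\mathcal{M}}$, (4) $\overline{T} \circ \overline{S} \in \mathcal{M}$.

Proof: Conclusion (1) is standard. Then (3) follows immediately because
$\overline{T} \circ S = \overline{T \circ S}$. Once (2) is proved, (4) follows from the fact that
$\overline{T} \circ \overline{S} = \overline{T \circ \overline{S}}$. Thus we
need only prove (2). With T as described above,

$$(T \circ \overline{S})(z) = T(\overline{S(z)}) = \frac{a\,\overline{S(z)} + b}{c\,\overline{S(z)} + d} = \overline{\left[\frac{\bar{a}S(z) + \bar{b}}{\bar{c}S(z) + \bar{d}}\right]}.$$

Conclusion (2) now follows from (1).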


Lemma 2. For each circle or extended line E, there is a unique $T \in \overline{\mathcal{M}}$ such that

$$E = \{z \in \hat{\mathbb{C}} : T(z) = z\}.$$

(E is exactly the set of fixed points of T.) This T is an involution of $\hat{\mathbb{C}}$; that is, $T \circ T$ is
the identity.


Proof: A circle described by $|z - a| = r$ (r > 0) is equivalently described by

$$z = T(z) = \frac{r^2}{\bar{z} - \bar{a}} + a.$$

A line $\{a + bt : t \in \mathbb{R}\}$ ($b \ne 0$) has the equation

$$\frac{z - a}{b} = \overline{\left(\frac{z - a}{b}\right)}, \quad\text{or}\quad z = T(z) = \frac{b}{\bar{b}}(\bar{z} - \bar{a}) + a.$$

Since $T(\infty) = \infty$, the extended line $\{a + bt : t \in \mathbb{R}\} \cup \{\infty\}$ is exactly the set of fixed
points of T.

For uniqueness suppose $T_1(z) = z = T_2(z)$ for all z on a circle or extended line.
Then $\overline{T_1}(z) = \overline{T_2}(z)$ for more than 2 values of z. Hence $\overline{T_1} = \overline{T_2}$ and $T_1 = T_2$.
(The uniqueness of T for an extended line may be surprising in view of the fact
that a and b are not uniquely determined by the extended line.)

For the involution proof we note that $T \circ T \in \mathcal{M}$ (Lemma 1, part (4)) and has all
the points of E as fixed points.


Definition. The transformation T described in Lemma 2 is called reflection in E,
and will be denoted by $\rho_E$. If confusion seems unlikely, $\rho_E(z)$ is denoted simply by
z* ($z \in \hat{\mathbb{C}}$). Also, z and z* are said to be symmetric with respect to E.

Now that reflection is solidly defined it is easy to prove the theorem. With
obvious changes the proof applies equally well to conjugate Möbius transformations.


Proof of Symmetry Principle: In precise terms we must show that

$$\rho_{T(E)}(T(z)) = T(\rho_E(z)) \qquad (z \in \hat{\mathbb{C}}),$$

or

$$\rho_{T(E)} \circ T = T \circ \rho_E.$$

But both sides of the last equation belong to $\overline{\mathcal{M}}$ (Lemma 1, parts (2) and (3)), and
they agree everywhere on E. Therefore we are finished.
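The identity $\rho_{T(E)} \circ T = T \circ \rho_E$ can be checked numerically in a concrete case. The sketch below takes E to be the unit circle and $T(z) = (z + 2)/(z - 3)$, both chosen here only for illustration; since the pole of T is off E, the image T(E) is a circle, which is recovered from three image points. The helper names are hypothetical.

```python
def mobius(a, b, c, d):
    """z -> (a z + b)/(c z + d); assumes ad - bc != 0 and c z + d != 0 where used."""
    return lambda z: (a * z + b) / (c * z + d)


def reflect_in_circle(z, center, radius):
    """Reflection in |z - center| = radius: r^2 / conj(z - center) + center."""
    return radius ** 2 / (z - center).conjugate() + center


def circle_through(p, q, r):
    """Center and radius of the circle through three non-collinear points."""
    w1, w2 = q - p, r - p
    denom = w1.conjugate() * w2 - w1 * w2.conjugate()
    center = p + w1 * w2 * (w1.conjugate() - w2.conjugate()) / denom
    return center, abs(center - p)


T = mobius(1, 2, 1, -3)
E_center, E_radius = 0j, 1.0

# T(E) is determined by the images of three points of E.
image_center, image_radius = circle_through(T(1 + 0j), T(1j), T(-1 + 0j))

z = 0.3 + 0.4j
z_star = reflect_in_circle(z, E_center, E_radius)         # symmetric point w.r.t. E
lhs = reflect_in_circle(T(z), image_center, image_radius)  # reflection of T(z) in T(E)
rhs = T(z_star)                                            # T applied to z*
print(abs(lhs - rhs))
```

The two sides agree to rounding error, as the theorem predicts.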
A Short Proof for Romberg Integration


T. von Petersdorff 


The Romberg extrapolation method for numerical integration is discussed in most 
numerical analysis textbooks. We give a short proof for the convergence rates of 
the Romberg extrapolations without using the Euler-Maclaurin formula. 

The Romberg method starts with the sequence of values $T_N$ of the composite
trapezoid rule with N = 1, 2, 4, 8, ... subintervals which converges to the exact
integral with a rate of $O(N^{-2})$. By using linear combinations of the values $T_N$, new
sequences $T_{N,1}, T_{N,2}, \ldots$ are constructed which converge to the exact integral with
the rates of $O(N^{-4}), O(N^{-6}), \ldots$ for $N \to \infty$. We will see that the gain of two
powers of N with each extrapolation step is due to the symmetry of the trapezoid
rule.

The classical proof of the Romberg method on an interval uses the Euler- 
Maclaurin formula to derive an asymptotic expansion of the error of the composite 
trapezoid rule (e.g., [2]). The convergence rate of the Romberg extrapolations then 
follows from this expansion and the fact that it contains only even powers of N. 

The proof of the Euler-Maclaurin formula is elementary. But the proof is based 
on certain recursion properties of the Bernoulli polynomials and it is not intuitively 
obvious what it is that makes the Romberg method work. 

The convergence properties of the Romberg method can be understood by 
using homogeneity and symmetry principles, see e.g. [1] and the references given 
there. Here we want to give a simple proof which only uses these two basic 
principles (and Taylor’s theorem). We will only derive the convergence rates of the 
extrapolated values based on the sequence of 1,2, 4,8,... subintervals. We do not 
obtain a general asymptotic expansion or formulae for the constants in the 
estimates. For results of this type see [2], [1] and the references given there. 


THE ROMBERG INTEGRATION METHOD. Let f be a continuous function on
the interval [a, b], and let $I(f) = \int_a^b f(x)\,dx$. The trapezoid rule on [a, b] is defined
by

$$T^{[a,b]}(f) = \tfrac12 (b - a)\bigl(f(a) + f(b)\bigr)$$

and the composite trapezoid rule with N subintervals on [a, b] is given by

$$T_N^{[a,b]}(f) = \sum_{k=1}^{N} T^{[x_{k-1},x_k]}(f) = \frac12\,\frac{b-a}{N}\Bigl(f(a) + f(b) + 2\sum_{k=1}^{N-1} f(x_k)\Bigr)$$

where $x_k = a + k(b - a)/N$, $k = 0, \ldots, N$. The Romberg extrapolations are defined recursively by

$$T_{k,0}(f) = T_k^{[a,b]}(f), \qquad T_{2k,m+1}(f) = \frac{2^{2m+2}\,T_{2k,m}(f) - T_{k,m}(f)}{2^{2m+2} - 1}$$

for integers $k \ge 1$, $m \ge 0$. The convergence is given by the following theorem:


Theorem 1. Let f be 2m + 2 times continuously differentiable on [a, b]. Let
$N = 2^m n$ with positive integers m, n. Then

$$|T_{N,m}(f) - I(f)| \le C_m (b - a)^{2m+3} \max_{\xi \in [a,b]} |f^{(2m+2)}(\xi)|\, n^{-(2m+2)}. \qquad (1)$$

Here the constant $C_m$ is independent of f, a, b, N.
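
The recursion above can be sketched in a few lines of Python. This is a minimal illustration, not code from the article: the function names `trapezoid` and `romberg` and the table layout `R[i][m]` (holding $T_{2^i,m}(f)$) are assumptions made for this example.

```python
import math


def trapezoid(f, a, b, N):
    """Composite trapezoid rule T_N^{[a,b]}(f) with N equal subintervals."""
    h = (b - a) / N
    return h * (0.5 * (f(a) + f(b)) + sum(f(a + k * h) for k in range(1, N)))


def romberg(f, a, b, levels):
    """Return the table R with R[i][m] = T_{2^i, m}(f) for 0 <= m <= i < levels."""
    R = []
    for i in range(levels):
        row = [trapezoid(f, a, b, 2 ** i)]
        for m in range(i):
            factor = 4 ** (m + 1)  # this is 2^(2m+2)
            # T_{2k,m+1} = (2^(2m+2) T_{2k,m} - T_{k,m}) / (2^(2m+2) - 1)
            row.append((factor * row[m] - R[i - 1][m]) / (factor - 1))
        R.append(row)
    return R


if __name__ == "__main__":
    table = romberg(math.exp, 0.0, 1.0, 5)
    exact = math.e - 1.0
    for row in table:
        print(["%.2e" % abs(value - exact) for value in row])
```

Each printed row shows the errors for a fixed number of subintervals; moving one column to the right corresponds to one extrapolation step.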


THE PROOF. We consider an integration rule on [−1, 1] of the form

$$A(f) = \sum_{j=1}^{J} f(\xi_j)\, w_j \qquad (2)$$

with certain nodes $\xi_j \in \mathbb{R}$ and weights $w_j \in \mathbb{R}$, $j = 1, \ldots, J$. The corresponding
rule for the interval [a, b] is given by $A^{[a,b]}(f) = \tfrac12 (b - a) A(\hat f)$ where $\hat f(x) =
f\bigl(((a + b) + (b - a)x)/2\bigr)$. We will use the following theorem which is a standard
result in numerical analysis textbooks and follows from Taylor’s theorem.


Theorem 2. Assume the integration rule (2) is exact for all polynomials of degree less
than or equal to r, let f be r + 1 times continuously differentiable. Then the composite
rule $A_N^{[a,b]}(f) = \sum_{k=1}^{N} A^{[x_{k-1},x_k]}(f)$ satisfies for all positive integers N

$$|A_N^{[a,b]}(f) - I(f)| \le C (b - a)^{r+2} \max_{\xi \in [a,b]} |f^{(r+1)}(\xi)|\, N^{-(r+1)}$$

where the constant C is independent of f, a, b, N.

We now make the following assumption about A(f):

Assumption 1. Let the integration rule A(f) on [−1, 1] be symmetric with respect to
0, i.e., $A(f) = A(\check f)$ where $\check f(x) = f(-x)$, for all continuous f. Furthermore, let the
rule A(f) be exact for all polynomials of degree less than or equal to q with some even
number q.


Then we have

Proposition 1. The rule A(f) is also exact for all polynomials of degree q + 1.

Proof: The function $x^{q+1}$ is odd, hence $A(x^{q+1}) = 0$ and $\int_{-1}^{1} x^{q+1}\,dx = 0$.

Now we consider the composite rule $A_2(f(x)) = \tfrac12 A(f((x - 1)/2)) + \tfrac12 A(f((x + 1)/2))$
with two subintervals on [−1, 1] and denote the quadrature errors by
$E(f) = A(f) - I(f)$, $E_2(f) = A_2(f) - I(f)$. Then

$$E_2(x^{q+2}) = \frac12 E\Bigl(\Bigl(\frac{x - 1}{2}\Bigr)^{q+2}\Bigr) + \frac12 E\Bigl(\Bigl(\frac{x + 1}{2}\Bigr)^{q+2}\Bigr) = 2^{-(q+2)} E(x^{q+2}) \qquad (3)$$

by expanding the powers and using Proposition 1. Therefore we can integrate $x^{q+2}$
exactly with the rule

$$\tilde A(f) = \frac{2^{q+2} A_2(f) - A(f)}{2^{q+2} - 1}. \qquad (4)$$

Hence this construction implies: 


Proposition 2. The rule $\tilde A(f)$ is symmetric and exact for all polynomials of degree
less than or equal to q + 2.
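
The gain of two degrees per extrapolation step can be checked with exact rational arithmetic. In the sketch below a rule on [−1, 1] is stored as a dictionary mapping nodes to weights; this representation and the helper names are assumptions made for illustration only. Starting from the trapezoid rule (q = 0), one step of (4) gives Simpson's rule, and a second step (q = 2) gives a rule exact up to degree 5.

```python
from fractions import Fraction


def composite2(rule):
    """A_2(f) = A(f((x-1)/2))/2 + A(f((x+1)/2))/2, expressed as a node -> weight dict."""
    out = {}
    for x, w in rule.items():
        for shift in (-1, 1):
            node = (x + shift) / 2
            out[node] = out.get(node, Fraction(0)) + w / 2
    return out


def extrapolate(rule, q):
    """The rule (2^(q+2) A_2 - A) / (2^(q+2) - 1) from formula (4)."""
    c = Fraction(2) ** (q + 2)
    out = {x: c * w / (c - 1) for x, w in composite2(rule).items()}
    for x, w in rule.items():
        out[x] = out.get(x, Fraction(0)) - w / (c - 1)
    return out


def exactness_degree(rule, max_degree=20):
    """Largest d such that the rule integrates 1, x, ..., x^d exactly on [-1, 1]."""
    degree = -1
    for k in range(max_degree + 1):
        exact = Fraction(2, k + 1) if k % 2 == 0 else Fraction(0)
        if sum(w * x ** k for x, w in rule.items()) != exact:
            break
        degree = k
    return degree


if __name__ == "__main__":
    trapezoid_rule = {Fraction(-1): Fraction(1), Fraction(1): Fraction(1)}
    simpson = extrapolate(trapezoid_rule, 0)
    next_rule = extrapolate(simpson, 2)
    print(simpson)
    print([exactness_degree(r) for r in (trapezoid_rule, simpson, next_rule)])
```

The printed degrees include the extra odd degree supplied by Proposition 1, which is why they increase in steps of two starting from 1.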


Now Theorem 1 follows by induction: Let $A^0(f) = T^{[-1,1]}(f)$. Obviously this
rule satisfies Assumption 1 with q = 0. Assume that the rule $A^m(f)$ satisfies
Assumption 1 with q = 2m. Define the rule $A^{m+1}(f) = \widetilde{A^m}(f)$ using (4) with
q = 2m. By Proposition 2, $A^{m+1}(f)$ satisfies Assumption 1 with q = 2m + 2.

By Proposition 1, the rule $A^m(f)$ is actually exact for all polynomials of degree
less than or equal to 2m + 1. Finally note that for $N = 2^m n$ we have $T_{N,m}(f) =
(A^m)_n^{[a,b]}(f)$ where $(A^m)_n^{[a,b]}(f)$ denotes the composite rule on [a, b] with n
subintervals which is based on the rule $A^m$. Therefore Theorem 2 implies (1).

Note that Theorem 1 remains true if we replace the trapezoid rule by any other
symmetric rule.


Remark. Romberg integration on triangles can be treated in a similar way: Here
the basic rule T uses the function values at the three vertices of the triangle; this
rule is exact for polynomials of total degree one or less. For a rule A on a triangle
we define the composite rule $A_N$ by dividing the triangle into $N^2$ congruent smaller
triangles and applying the basic rule A on each subtriangle. Assume that the rule
A is exact for all polynomials of total degree q or less with q even. Let E and $E_2$
be the integration errors of A and $A_2$. If $f_{q+1}$ and $f_{q+2}$ are monomials of total
degree q + 1 and q + 2, respectively, then we obtain

$$E_2(f_{q+1}) = 2^{-(q+2)} E(f_{q+1}), \qquad E_2(f_{q+2}) = 2^{-(q+2)} E(f_{q+2}). \qquad (5)$$

Hence the rule $\tilde A$ defined by (4) will be exact for polynomials of total degree q + 2
or less. To prove (5), consider the triangle with vertices (0, 0), (1, 0) and (0, 1). Then
proceed analogously as in (3) and note that one of the four subtriangles is rotated
by 180 degrees. Therefore one of the four terms arising from $E_2(f_{q+1})$ has the
opposite sign. For $E_2(f_{q+2})$ expand the arising terms in monomials of degree
q + 2, q + 1, and lower order terms. Then the terms of order q + 1 will cancel
each other since the central subtriangle is rotated by 180 degrees. As the rule
$A^0 = T$ is exact for polynomials of degree zero, induction shows that the rule $A^m$
is exact for polynomials of total degree 2m or less. Hence the Romberg extrapolations
$T_{N,m}$ converge with order $O(N^{-(2m+1)})$. This is one order lower than in the
one-dimensional case, and this result cannot be improved. But no symmetry of the
underlying rule T is required for this argument, so T can be any quadrature rule
which is exact for the function 1.
An Elementary Proof that the Borromean
Rings Are Non-Splittable


Ollie Nanyes 


Lindström and Zetterström [1] gave a proof that the Borromean rings (Figure 1)
could not consist of true circles. In this note, we give an elementary proof (sans
algebraic topology) that the Borromean rings are “linked” though no two components
are. The tool that we use is the colorability mod n of a knot or link diagram.
This tool has been presented in honors undergraduate seminars. I have included a
discussion of colorability mod n though the technique is well known. For example,
see Kauffman, Chapter VI [2].


1. DEFINITIONS. A knot will be defined as a smooth (or polyhedral) simple
closed curve in 3-space $\mathbb{R}^3$. A link is defined as a collection of disjoint smooth (or
polyhedral) simple closed curves in $\mathbb{R}^3$. Two knots or links $K_1$ and $K_2$ are said to
be equivalent if there is an orientation preserving homeomorphism $h: \mathbb{R}^3 \to \mathbb{R}^3$
such that $h(K_1) = K_2$. A link L is said to be splittable if there exists a smooth (or
polyhedral) 3-ball B, an ordering of the components of the link $K_1, K_2, \ldots, K_m$,
and an integer 0 < k < m such that $K_j \subset B$ for $j \le k$ and $K_i \subset S^3 - B$ for i > k.
A diagram for a knot or link K is an image of a regular projection (all
self-intersections are non-tangential (transverse) and are double points) of K onto
a plane with crossing information at each double point (p. 215, reference 2). Note
that FIGURES 1 and 4 are examples of diagrams. Two knot or link diagrams $D_1$ and
$D_2$ are said to be equivalent if $D_1$ can be obtained from $D_2$ by:

1) Deformations of the plane which do not alter the crossing information at
each double point and

2) The three Reidemeister moves and their inverses. See FIGURE 2 for an
illustration of these.


Figure 1 
Figure 2. The Reidemeister Moves R1, R2, and R3


2. THEOREMS. The following theorem is well known and will not be proved 
here. 


Theorem 1. Two knots or links are equivalent if and only if they have equivalent 
diagrams. See section 1B of reference [4] for a proof. 


A knot or a link K is said to be colorable mod n (n is assumed to be 3 or 
greater) if K has a diagram D in which it is possible to assign an integer to each 
arc of D which does not contain an undercrossing of D such that: 

1) at each crossing we have a + c = 2b (mod n) where b is the integer assigned 
to the overcrossing and a and c are the integers assigned to the other two arcs (see 
FIGURE 3) and 

2) at least 2 distinct integers mod n are used in the diagram. 


The following theorem is well known: 


Theorem 2. If $K_1$ is a knot or a link which is colorable mod n then every diagram of
$K_1$ is colorable mod n.


Proof: Exercise. All one has to check is: if a diagram D is colorable mod n and if
one applies either a Reidemeister move (or its inverse) to D, the resulting diagram
remains colorable mod n. □


It follows from Theorem 1 and Theorem 2 that if $K_1$ is a knot or a link which is
colorable mod n and $K_2$ is equivalent to $K_1$, then $K_2$ is colorable mod n.


Corollary 3. There exists a knot which is not equivalent to the unknot.


Proof: Note that the trefoil knot (see FIGURE 4) is colorable mod 3 whereas the
unknot is not. □
We now come to the main result of this note:

Theorem 5. If a link L is splittable then L is colorable mod 3.


Proof: If L is splittable with a splitting ball B, then there exists a diagram for L in
which the images of $L \cap B$ are separated from the images of $L \cap (S^3 - B)$ by a
circle C. Give the components of the diagram of $L \cap B$ the monochrome coloring
by assigning the integer 0 to each strand. Similarly, assign the strands of the
diagram of $L \cap (S^3 - B)$ the integer 1. □


It is an exercise to see that the standard diagram of the Borromean rings is not
colorable mod n for any n > 1. The integer labeling of the diagram depicted in
FIGURE 1 illustrates this: one has no choice but to set a = b. Thus we have an
elementary proof that the Borromean rings link is unsplittable and thus the rings
cannot be pulled apart.
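
The coloring condition is a finite check, so the mod 3 case used in Theorem 5 can be tested by brute force. In the sketch below a diagram is encoded as a number of arcs plus a list of crossings `(over, under_1, under_2)`. The encodings of the trefoil and of the Borromean rings are assumptions based on the usual alternating diagrams (three arcs for the trefoil, two arcs per ring for the Borromean rings); they are not read off the article's figures.

```python
from itertools import product


def is_colorable(num_arcs, crossings, n):
    """True if some assignment of integers mod n to the arcs uses at least two
    values and satisfies a + c = 2b (mod n) at every crossing (b = over arc)."""
    for colors in product(range(n), repeat=num_arcs):
        if len(set(colors)) < 2:
            continue
        if all((colors[a] + colors[c] - 2 * colors[b]) % n == 0
               for b, a, c in crossings):
            return True
    return False


# Trefoil: arcs 0, 1, 2; at each crossing one arc passes over the other two.
TREFOIL = (3, [(0, 1, 2), (1, 2, 0), (2, 0, 1)])

# Borromean rings (assumed encoding): ring A = arcs 0, 1; B = arcs 2, 3; C = arcs 4, 5.
# A passes over B twice, B over C twice, C over A twice.
BORROMEAN = (6, [(0, 2, 3), (1, 2, 3), (2, 4, 5), (3, 4, 5), (4, 0, 1), (5, 0, 1)])

# Two unlinked circles drawn without crossings: a split link.
UNLINK = (2, [])

if __name__ == "__main__":
    print(is_colorable(*TREFOIL, 3))
    print(is_colorable(*UNLINK, 3))
    print(is_colorable(*BORROMEAN, 3))
```

The split link is colorable mod 3 exactly as in the proof of Theorem 5, so a negative answer for the Borromean encoding is what the unsplittability argument needs.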


Remark. If a knot or link K is colorable mod n, then one can obtain a homomorphism
from $\pi_1(\mathbb{R}^3 - K)$ onto the dihedral group $D_n$, where $D_n = \{s, t \mid s^2 = 1 = t^n,
sts = t^{n-1}\}$. This homomorphism is determined by the particular choice of coloring.
See Kauffman [2] or Fox [5].


Figure 3 


Figure 4 
Letter to the Editor:


Recently Grosof and Taiani [1] gave an algebraic proof that if
$Q(X) = \prod_{i=1}^{n}(X - r_i)$ with the $r_i$ distinct, then $\sum P(r_i)/Q'(r_i) = 0$ for
$\deg(P) \le n - 2$. I should like to add that this result has a home in
algebraic number theory, as part of the computation of the “different”.
The usual proof there [2, p. 135; 3, p. 56; 4, p. 144] is yet another
ingenious algebraic argument. First, standard methods yield the
partial fraction decomposition

$$1/Q(X) = \sum_i Q'(r_i)^{-1}/(X - r_i).$$

The right-hand side, as a formal power series in $X^{-1}$, is

$$X^{-1}\sum_i Q'(r_i)^{-1}/(1 - r_i X^{-1}) = \sum_{k \ge 0}\Bigl[\sum_i Q'(r_i)^{-1} r_i^k\Bigr] X^{-k-1}.$$

But the left-hand side is

$$(X^n + a_1 X^{n-1} + \cdots)^{-1} = X^{-n}(1 + a_1 X^{-1} + \cdots)^{-1} = X^{-n} - a_1 X^{-n-1} + \cdots.$$

Comparing terms, we recover the fact that $\sum_i r_i^k/Q'(r_i) = 0$ for
$k \le n - 2$; we also see that the sum is equal to 1 when k = n − 1.
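
The identity is easy to confirm in exact arithmetic for any particular choice of distinct roots. The following sketch uses $Q'(r_i) = \prod_{j \ne i}(r_i - r_j)$; the function name and the sample roots are illustrative assumptions.

```python
from fractions import Fraction


def power_sum_over_derivative(roots, k):
    """Return sum_i r_i^k / Q'(r_i) for Q(X) = prod_i (X - r_i) with distinct roots."""
    total = Fraction(0)
    for i, r in enumerate(roots):
        derivative = Fraction(1)
        for j, s in enumerate(roots):
            if j != i:
                derivative *= (r - s)  # Q'(r_i) is the product of (r_i - r_j), j != i
        total += Fraction(r) ** k / derivative
    return total


if __name__ == "__main__":
    roots = [1, 2, 4, 7]  # n = 4
    print([power_sum_over_derivative(roots, k) for k in range(4)])
```

For n = 4 the first three sums (k = 0, 1, 2) vanish and the last (k = 3 = n − 1) equals 1, matching the comparison of coefficients above.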
UNSOLVED PROBLEMS


Edited by: Richard Guy and Richard Nowakowski 


In this department the MONTHLY presents easily stated unsolved problems dealing 
with notions ordinarily encountered in undergraduate mathematics. Each problem 
should be accompanied by relevant references (if any are known to the author) and by 
a brief description of known partial or related results. Typescripts should be sent to 


Richard Guy, Department of Mathematics & Statistics, The University of Calgary,
Alberta, Canada T2N 1N4.


Open Problems in Pattern Avoidance 


James Currie 


INTRODUCTION. What makes a mathematical area interesting? The area should 
contain a range of open problems: some very concrete and approachable, others 
“bigger”. These days it might help for the area to tie in with chaos and fractals. 
Finally, it couldn’t hurt for someone to offer cash for solutions to problems in the 
area. 

A word w over an alphabet $\Sigma$ is nonrepetitive (or squarefree) if no two adjacent
blocks in w are identical. For example, the word v = abcacb is nonrepetitive. On
the other hand, the word u = abcbcd is repetitive, since bc occurs next to itself in
u. Early in this century the Norwegian number theorist Axel Thue showed that
arbitrarily long nonrepetitive words can be formed using only three letters [25].
Since an infinite tree with finite branching must contain an infinite path, one can
also find “infinite words” on three letters which are nonrepetitive. We refer to
these “infinite words” as $\omega$-words.

Thue’s result has been rediscovered and republished a dozen times or more. 
One reason for this sequence of rediscoveries is that nonrepetitive sequences have 
been used to construct counterexamples in many areas of mathematics: ergodic 
theory, formal language theory, universal algebra and group theory, for example 
[16, 12, 6, 22]. 


WORDS AVOIDING PATTERNS. A word is a finite sequence of elements of some
finite set $\Sigma$. We call the set $\Sigma$ an alphabet, the elements of $\Sigma$ letters. The set of all
words over $\Sigma$ is written $\Sigma^*$. We take a naive view of words as strings of symbols;
thus the concatenation of two words w and v, written wv, is simply the string
consisting of the letters of w followed by the letters of v. The empty word, with no
letters, is denoted by e.

Let S and T be alphabets. A substitution $h: S^* \to T^*$ is a function generated
by its values on S. That is, suppose $w \in S^*$, $w = w_1 w_2 \cdots w_n$ with $w_i \in S$,
$i = 1, \ldots, n$. Then $h(w) = h(w_1)h(w_2)\cdots h(w_n)$. We do not allow $h(w_i) = e$ for any i.
As an example, we could give a substitution $h: \{1,2,3\}^* \to \{1,2,3\}^*$ by h(1) = 123,
h(2) = 13, h(3) = 2. In this case, h(123) = h(1)h(2)h(3) = 123132.
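
A substitution is just letter-by-letter replacement followed by concatenation. The small sketch below (the function name is an assumption for illustration) reproduces the example above and rejects substitutions that send a letter to the empty word.

```python
def apply_substitution(h, word):
    """Apply the substitution h (a dict from letters to nonempty words) to a word."""
    if any(image == "" for image in h.values()):
        raise ValueError("a substitution may not send a letter to the empty word")
    return "".join(h[letter] for letter in word)


if __name__ == "__main__":
    h = {"1": "123", "2": "13", "3": "2"}
    print(apply_substitution(h, "123"))
```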

A nonrepetitive word over $\Sigma$ is said to avoid xx; it cannot be written ah(xx)b
where $a, b \in \Sigma^*$ and $h: \{x\}^* \to \Sigma^*$ is a substitution. Thue also showed that
arbitrarily long cubefree words on two letters exist [25]. Such words avoid xxx in
the sense that they cannot be written ah(xxx)b. The infinite cubefree word
discovered by Thue is referred to as the Morse-Thue sequence, and is an
important example in symbolic dynamics [16, 21]. Symbolic dynamics is a key tool
for studying chaos.

Before posing our problems, we need a bit more background. Let w and p be
words. We say that w contains pattern p if we can write w = ah(p)b for words a
and b, and some substitution h. Otherwise, we say that w avoids p. Let a pattern
p be fixed. Let $\Sigma$ be an alphabet with k letters. If there are arbitrarily long words
over $\Sigma$ avoiding p, we say that p is avoidable on $\Sigma$. Clearly, only the number k of
letters in $\Sigma$ is significant here, so we also say that p is avoidable on k letters.

For example, xx is avoidable on 3 letters. We say that p is unavoidable if there 
is no k for which p is avoidable on k letters. For example, xyx is unavoidable. 
According to a pretty result of Zimin [26], a pattern p on n letters is avoidable if
and only if $Z_n$ avoids p, where $Z_n$ is the word on $\{1, 2, \ldots, n\}$ defined by $Z_1 = 1$,
$Z_n = Z_{n-1}\, n\, Z_{n-1}$, $n > 1$. However, no method is known to determine the smallest
alphabet on which p is avoidable [2, 3]. In [2], a word U, is given which is 
avoidable on 4 letters, but not on 3. Perhaps all avoidable words are avoidable on 4 
letters. 
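
Both the Zimin words and the relation "w contains p" are finite, so they can be explored by direct search. The sketch below builds $Z_n$ and tests containment by backtracking over all nonempty images of the pattern letters. The function names are assumptions, and the search is exponential in the worst case, so it is only meant for small experiments.

```python
def zimin(n):
    """Return Z_n as a list of integers: Z_1 = 1, Z_n = Z_{n-1} n Z_{n-1}."""
    word = [1]
    for letter in range(2, n + 1):
        word = word + [letter] + word
    return word


def contains(word, pattern):
    """True if word = a h(pattern) b for some substitution h with nonempty images."""
    word = tuple(word)

    def match(pos, idx, images):
        if idx == len(pattern):
            return True
        var = pattern[idx]
        if var in images:
            img = images[var]
            end = pos + len(img)
            return word[pos:end] == img and match(end, idx + 1, images)
        for end in range(pos + 1, len(word) + 1):
            images[var] = word[pos:end]
            if match(end, idx + 1, images):
                return True
        images.pop(var, None)  # undo the tentative choice before backtracking
        return False

    return any(match(start, 0, {}) for start in range(len(word)))


if __name__ == "__main__":
    print(contains("abcbcd", "xx"), contains("abcacb", "xx"))
    print(contains(zimin(2), "xyx"), contains(zimin(1), "xx"))
```

With Zimin's criterion, the second line of output reflects the two examples in the text: xyx occurs in $Z_2$ (so it is unavoidable), while xx does not occur in $Z_1$ (so it is avoidable).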

A word w is strongly nonrepetitive if no two adjacent blocks in w are permutations
of each other. For example, u = 512341231416 is not strongly nonrepetitive
since the adjacent blocks 12341 and 23141 are permutations of each other. Let p
be a word over an alphabet $\Sigma$, $p = p_1 p_2 \cdots p_n$, $p_i \in \Sigma$, $i = 1, \ldots, n$. Say that a
word w strongly avoids p if we cannot write $w = a P_1 P_2 \cdots P_n b$ where a, b are
words, the $P_i$ are nonempty words, and $P_i$ is a permutation of $P_j$ whenever $p_i = p_j$.
Thus a word is strongly nonrepetitive if and only if it strongly avoids xx.
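
The example can be reproduced mechanically by comparing letter counts of adjacent blocks. The helper below (its name and return convention are assumptions for illustration) returns the first pair of adjacent blocks that are permutations of each other, scanning by starting position and then by block length, or `None` if the word is strongly nonrepetitive.

```python
from collections import Counter


def adjacent_permuted_blocks(word):
    """Return the first pair of adjacent blocks that are permutations of each
    other, or None if no such pair exists."""
    n = len(word)
    for start in range(n):
        for length in range(1, (n - start) // 2 + 1):
            left = word[start:start + length]
            right = word[start + length:start + 2 * length]
            if Counter(left) == Counter(right):
                return left, right
    return None


if __name__ == "__main__":
    print(adjacent_permuted_blocks("512341231416"))
    print(adjacent_permuted_blocks("abcacb"))
```

The second call shows that the nonrepetitive word abcacb from the introduction is not strongly nonrepetitive, since abc and acb are permutations of each other.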

It was known for some time that xx is strongly avoidable on 5 letters, but not on 
3 letters [23]. It has recently been shown that xx is strongly avoidable on 4 letters 
[19]. On the other hand, the smallest alphabet on which xxx can be strongly 
avoided is the 3 letter alphabet and the smallest alphabet on which xxxx can be 
strongly avoided is the 2 letter alphabet [10]. 

Let $a = a_1 a_2 a_3 a_4 \cdots$ and $b = b_1 b_2 b_3 b_4 \cdots$ be $\omega$-words over some alphabet $\Sigma$,
with $a_i, b_i \in \Sigma$. Define the distance between a and b to be $\rho(a, b) = 1/k$ where
$k = \min\{i \in \mathbb{N} \mid a_i \neq b_i\}$. Thus the longer a and b go on agreeing, the closer
together a and b are. Let L be the set of nonrepetitive $\omega$-words over $\Sigma = \{1, 2, 3\}$.
With respect to the metric $\rho$, L has no isolated points; for any nonrepetitive
$\omega$-word a over $\Sigma$, we can find distinct nonrepetitive $\omega$-words over $\Sigma$ agreeing with
a to as many places as desired [24]. It follows that L is a Cantor set.


Concrete Problems 


1. Is there a pattern w which is avoidable on 5 letters but not on 4 letters? [2] 

2. Let L be the set of nonrepetitive words over the 3 letter alphabet {1, 2, 3}. It 
is known that c(n), the number of words of L of length n, grows exponen- 
tially [4]. Give an exact enumeration for L. For the solution to this problem 
I offer US$100. 
3. It is known [15] that the set of cubefree $\omega$-words over a 2-letter alphabet is
uncountable. Is the set a Cantor set?
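
For Concrete Problem 2, the first values of c(n) can be tabulated by extending nonrepetitive words one letter at a time; a word that was nonrepetitive before the extension can only acquire a square ending at the new letter. This brute-force sketch (function names are assumptions) is exponential and only indicates the kind of data an exact enumeration would have to explain.

```python
def ends_with_square(word):
    """True if some suffix of word has the form xx with x nonempty."""
    n = len(word)
    return any(word[n - 2 * k:n - k] == word[n - k:] for k in range(1, n // 2 + 1))


def count_squarefree(max_len, alphabet="123"):
    """Return [c(1), ..., c(max_len)], the numbers of nonrepetitive words by length."""
    counts = []
    level = [""]
    for _ in range(max_len):
        level = [w + ch for w in level for ch in alphabet
                 if not ends_with_square(w + ch)]
        counts.append(len(level))
    return counts


if __name__ == "__main__":
    print(count_squarefree(12))
```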


“Bigger” Problems


1. Is there an algorithm which decides, given a pattern p and a natural
number k, whether p is avoidable on k letters? [3] If so, give such an
algorithm. I offer US$100 for the solution to this problem.

2. Define strongly avoidable in the obvious way. Is there an algorithm which
decides, given a pattern p, whether p is strongly avoidable? If so, give such
an algorithm. Again, US$100 to the solver of this problem.

3. Is there an algorithm which decides, given a pattern p and a natural
number k, whether p is strongly avoidable on k letters? If so, give such an
algorithm. I offer US$100 for the solution to this problem.

4. For US$100, decide the following conjecture: If pattern p is avoidable on
Σ, then the set of ω-words on Σ avoiding p is a Cantor set.

5. For US$100, decide the following conjecture: If the smallest alphabet on
which p is avoidable is {1, 2, ..., k}, then there exists a natural number m,
and substitutions f: {1, 2, ..., m}* → {1, 2, ..., k}* and g: {1, 2, ..., m}* →
{1, 2, ..., m}* such that f(g^n(1)) avoids p for every n ∈ N.
Serendipity

After reading the letter to the editor from R. Norwood [The last math journal,
Amer. Math. Month., 1993, p. 491-2], I wonder what will happen to those of us who
enjoy mathematics, do not have a computer and like to read on public
transportation. How will one be able to browse through various items including
The American Mathematical Monthly at one’s leisure and above all come across the
most interesting articles which are always next to those one had planned to read?
Will serendipity end?


A. M. Herzberg 
Department of Mathematics 
and Statistics 
Queen’s University 
Kingston, K7L 3N6 
CANADA 
and correct. Of course, an elegant partial solution or a method leading to a more
general result is always useful and welcome. In addition, references to other 
appearances of MONTHLY problems or to solutions of these problems in the 
literature are also solicited. 


PROBLEMS 


10330. Proposed by R. Bruce Richter, Carleton University, Ottawa, Ontario, Canada,
and Josef Širáň, Technical University of Bratislava, Bratislava, Slovakia.


Let n and k be given positive integers. Define q, r, s, t to be the unique
integers such that $n = qk + r = s(k + 1) + t$, with $0 \le r < k$ and $0 \le t \le k$.
Show that

$$\binom{q}{2} k + rq \ge \binom{s}{2}(k + 1) + st.$$


10331. Proposed by Carl Pomerance, University of Georgia, Athens, GA.


Find all positive integers n such that n! is multiply perfect; i.e., a divisor of the
sum of its positive divisors.


10332. Proposed by Kiran S. Kedlaya, student, Harvard University, Cambridge, 
MA. 
If n and k are integers with $0 \le k \le n$, prove that
2n _ n—-k—2j| 7% n —J 
Le | d? ies 


10333. Proposed by Michael Golomb, Purdue University, West Lafayette, IN. 


For a positive integer n with $2^k \le n < 2^{k+1}$, let $L(n) = 2^k$ (k = 0, 1, 2, ...).
Let S(n) be the sum of the binary digits of n.

(a) Evaluate $\displaystyle\sum_{n \ge 1} \frac{1}{L^2(n)S(n)}$.

(b) Show that $\displaystyle\sum_{n \ge 1} \frac{1}{L(n)S(n)}$ diverges.

(c) Show that $\displaystyle\sum_{n \ge 1} \frac{1}{L(n)S^{1+\delta}(n)}$ converges for every $\delta > 0$.


10334. Proposed by John Sarli, California State University, San Bernardino, CA. 


Let M be a fixed n by n matrix with complex entries which is not nilpotent.
For $a, b \in \mathbb{C}$, define the linear operator $M_{a,b}$ on the space of n by n complex
matrices by $M_{a,b}(N) = aMN + bNM$. If the operators $M_{a,b}$ and $M_{c,d}$ have the
same characteristic polynomial, show that $a^k + b^k = c^k + d^k$ for some k,
$1 \le k \le n$.


10335. Proposed by David Borwein, University of Western Ontario, London, On- 
tario, Canada, and Jonathan Borwein, Simon Fraser University, Burnaby, British 
Columbia, Canada. 


Let r be a positive constant and $c_0 \ge 0$. Consider the iteration

$$c_{n+1} = c_n + r - \frac{c_n}{\sqrt{1 + c_n^2}}.$$

(a) For which values of r does the sequence $(c_n)$ converge?

(b) In case of convergence to c with $c \ne c_0$, prove that $\lim (c_{n+1} - c)/(c_n - c)$
exists and determine its value.

(c) In case of divergence, find an asymptotic expression for $c_n$.


10336. Proposed by Ignacy I. Kotlarski, Oklahoma State University, Stillwater, OK. 


Let $X_1, X_2, \ldots$ be a sequence of independent identically distributed random
variables, each exponentially distributed with parameter a, a > 0, i.e., for k =
1, 2, ...,

$$P(X_k \le x) = \begin{cases} 0 & \text{if } x < 0,\\ 1 - e^{-ax} & \text{if } x \ge 0. \end{cases}$$

Let B be a fixed Borel set in $[0, \infty)$ such that its Lebesgue measure $\mu_L(B)$ is finite
and positive. Let

$$Y_k = X_1 + \cdots + X_k$$

for k = 1, 2, ..., and

$$\theta = \sum_{k=1}^{\infty} P(Y_k \in B).$$

(a) Find $\theta$ as a function of a.
(b) Find a uniform minimum variance unbiased estimator of $\theta$ from a sample
from the above exponential distribution of a fixed size n.


10337. Proposed by Horst Alzer, Waldbröl, Germany.


Let $n \ge 1$ be an integer. Let $x_1, \ldots, x_n$ be real numbers with $x_i \in (0, 1/2]$.
Consider the statement

$$\prod_{i=1}^{n} \frac{x_i}{1 - x_i} \le \frac{\sum_{i=1}^{n} x_i^n}{\sum_{i=1}^{n} (1 - x_i)^n}. \qquad (F_n)$$

(a) Prove $F_n$ for $n \le 3$.
(b) Show that $F_n$ is false for $n \ge 6$.
(c)* What about $F_4$ and $F_5$?


NOTES 


Notes: (10332) The sum may be considered as a sum over all integers j by using
the convention that a binomial coefficient $\binom{a}{b}$ is zero unless $0 \le b \le a$. (10337)
The inequality $F_n$ was suggested by the related statement

$$\prod \bigl(x_i/(1 - x_i)\bigr)^{1/n} \le \sum x_i \Big/ \sum (1 - x_i),$$

with i = 1...n in all sums and products. This statement is true for all $n \ge 1$ under
the conditions given in the statement of the problem. More information on this
inequality, due to Ky Fan, can be found in E. F. Beckenbach and R. Bellman,
Inequalities.


SOLUTIONS 


Alternating Parity in Chebyshev Systems 


E 3456 [1991, 646]. Proposed by A. S. Cavaretta, Kent State University, Kent, OH. 


Suppose $0 = m_0 < m_1 < \cdots < m_n$ are integers such that $m_i \equiv i \pmod 2$.
(i) Prove that a real polynomial

$$c_0 + c_1 x^{m_1} + \cdots + c_n x^{m_n}, \quad\text{with } c_0 c_n \ne 0,$$

has at most n real zeros, each zero being counted according to its multiplicity.
(ii) Prove that the generalized Vandermonde determinant

$$\begin{vmatrix} x_0^{m_0} & x_1^{m_0} & \cdots & x_n^{m_0}\\ x_0^{m_1} & x_1^{m_1} & \cdots & x_n^{m_1}\\ \vdots & \vdots & & \vdots\\ x_0^{m_n} & x_1^{m_n} & \cdots & x_n^{m_n} \end{vmatrix}$$

is non-zero if $x_0, x_1, \ldots, x_n$ are any n + 1 distinct real numbers.


Solution by Thomas Kunkle, College of Charleston, Charleston, SC. Part (i): Let
p(x) be the polynomial in question. We say that the sequence

$$c_0, c_1, c_2, \ldots, c_n \qquad (1)$$

has a sign change at i ($0 \le i < n$), if, for some $k \ge 1$, $c_i c_{i+k} < 0$, and if, for every j
strictly between i and i + k, $c_j = 0$. By Descartes’s Rule of Signs, the number of
positive zeros of p(x) (counted according to multiplicity) is at most the number of
sign changes of (1), and the number of negative zeros is at most the number of sign
changes of

$$c_0, -c_1, c_2, \ldots, (-1)^n c_n. \qquad (2)$$

Since $p(0) \ne 0$, to prove (i), we need only show that the number of sign changes of
(1) and that of (2) sum to at most n.

By i we will always mean a nonnegative integer, strictly less than n, for which
$c_i \ne 0$. By definition, a sign change can occur only at such i. Because $c_0 c_n \ne 0$, for
every i there exists a k = k(i) such that $c_i c_{i+k} \ne 0$, and, for all j strictly between i
and i + k, $c_j = 0$. If k = 1, then exactly one of (1) and (2) will have a sign change
at i, and if, instead, k > 1, then both (1) and (2) might have a sign change at i.
Thus the total number of sign changes is less than or equal to the number of i for which
k(i) = 1 plus twice the number of i for which $k(i) \ge 2$. This is less than or equal to
$\sum_i k(i) = n$. This completes the proof of (i).
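
The counting argument in part (i) can be checked exhaustively for small n. In the sketch below a polynomial is represented only by its coefficient list $(c_0, \ldots, c_n)$; because $m_i \equiv i \pmod 2$, the coefficients of p(−x) are $(-1)^i c_i$, which is sequence (2). The function names and the restriction to coefficients in {−1, 0, 1} (only signs matter) are choices made for this illustration.

```python
from itertools import product


def sign_changes(coeffs):
    """Number of sign changes in a sequence, ignoring zero entries."""
    nonzero = [c for c in coeffs if c != 0]
    return sum(1 for u, v in zip(nonzero, nonzero[1:]) if u * v < 0)


def descartes_bound(coeffs):
    """Sign changes of (1) plus sign changes of (2): an upper bound for the
    number of real zeros when c_0 != 0 and the exponents alternate in parity."""
    flipped = [(-1) ** i * c for i, c in enumerate(coeffs)]
    return sign_changes(coeffs) + sign_changes(flipped)


if __name__ == "__main__":
    for n in range(1, 6):
        worst = max(descartes_bound(c)
                    for c in product((-1, 0, 1), repeat=n + 1)
                    if c[0] != 0 and c[-1] != 0)
        print(n, worst)
```

For each n the largest bound found should not exceed n, in agreement with the estimate $\sum_i k(i) = n$.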

Part (ii): Suppose that the determinant is zero, or, equivalently, that there exists
a nontrivial polynomial

$$p(x) = c_0 + c_1 x^{m_1} + \cdots + c_n x^{m_n}$$

vanishing at the points $x_0, \ldots, x_n$. If $c_0$ is not zero, then, by part (i) of this
problem, the number of real zeros of p(x) cannot exceed $\max\{k : c_k \ne 0\} \le n$, a
contradiction. If, instead, $c_0$ is zero, we set $l := \min\{k : c_k \ne 0\}$, and rewrite p(x)
as $x^{m_l}$ times

$$q(x) = c_l + c_{l+1} x^{m_{l+1} - m_l} + \cdots + c_n x^{m_n - m_l}.$$

Since at least n of the points $x_0, \ldots, x_n$ are nonzero, q(x) has n real zeros. This
also contradicts part (i), according to which q(x) has at most $\max\{k - l : c_k \ne 0\} \le
n - 1$ real zeros. Thus the determinant cannot be zero.


Editorial comment. After the problem appeared, the proposer learned from E.
Passow that part (i) had already appeared, with various generalizations, in E.
Passow, “Alternating parity of Tchebycheff systems,” J. Approx. Theory 9 (1973),
295-298. Related results are contained in E. Passow, “Extended Chebycheff
systems on (−∞, ∞),” SIAM J. Math. Anal. 5 (1974), 762-763. The solver suggested
G. Pólya and G. Szegő, Problems and Theorems in Analysis, Vol. II, Springer-Verlag,
1972-76 for information on Descartes’s Rule of Signs.
Solved also by D. W. Bailey, S.-J. Bang (Korea), D. Callan, R. J. Chapman (U.K.), P. Cizek (student,
Czech Republic), T. C. Craven, M. Dindos (Slovakia), J. Duemmel, N. J. Fine, F. Flanigan, L. L. 
Gardner, H. W. Guggenheimer, Y. Ikeda, A. A. Jagers (The Netherlands), X. F. Jiang (China), I. 
Kastanas, K. S. Kedlaya (student), D. W. Koster, O. P. Lossers (The Netherlands), R. Martin (student), 
J. S. Muldowney (Canada), R. J. Neuhaus, J. H. Nieto (Venezuela), A. Nijenhuis, A. Pechtl (Germany), 
A. Pedersen (Denmark), F. C. Rembis, J. Rickert, M. Roth & O. Such (Canada), E. T. Wong, and the 
proposer. 


Restricted Block-Walking 


E 3465 [1991, 852]. Proposed by Dragomir Z. Dokovié, University of Waterloo, 
Waterloo, Ontario, Canada. 


Let p, q, m, and n be given non-negative integers. Compute the number of
sequences of m + n + 1 integers $k_{-m}, k_{-m+1}, \ldots, k_{-1}, k_0, k_1, \ldots, k_{n-1}, k_n$
satisfying

(i) $-p \le k_{-m} \le k_{-m+1} \le \cdots \le k_n \le q$,
(ii) $k_{-1} \le 0 \le k_1$.


Solution by William Y. C. Chen, Los Alamos National Laboratory, Los Alamos,
NM. The answer is

$$\binom{m + p}{m}\binom{n + q + 1}{n + 1} + \binom{m + p}{m + 1}\binom{n + q}{n}.$$

Recall that the number of nondecreasing sequences of r integers confined to an
interval of s integers is $\binom{s + r - 1}{r}$ (selections of r integers from s types with
repetitions allowed). Now consider two cases: (1) $k_0 \ge 0$, or (2) $k_0 \le -1$. In each
case, the desired sequences are built by solving two selection problems. In Case
(1), we have

$$-p \le k_{-m} \le \cdots \le k_{-1} \le 0 \quad\text{and}\quad 0 \le k_0 \le k_1 \le \cdots \le k_n \le q.$$

In Case (2), we have

$$-p \le k_{-m} \le \cdots \le k_{-1} \le k_0 \le -1 \quad\text{and}\quad 0 \le k_1 \le \cdots \le k_n \le q.$$

In Case (1), we take m elements from p + 1 and n + 1 from q + 1; in Case (2),
we take m + 1 elements from p and n from q + 1. Together, we have the
formula claimed.
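
The count can be compared with a direct enumeration for small parameters. The sketch below assumes the reading of the conditions used in the two cases of the solution, namely a nondecreasing sequence with values in [−p, q] satisfying $k_{-1} \le 0 \le k_1$, and it assumes $m, n \ge 1$ so that $k_{-1}$ and $k_1$ exist. The function names are illustrative.

```python
from itertools import combinations_with_replacement
from math import comb


def brute_force(p, q, m, n):
    """Count nondecreasing (k_{-m}, ..., k_n) in [-p, q] with k_{-1} <= 0 <= k_1."""
    count = 0
    # combinations_with_replacement of a sorted range yields nondecreasing tuples;
    # the entry k_j sits at tuple index j + m.
    for seq in combinations_with_replacement(range(-p, q + 1), m + n + 1):
        if seq[m - 1] <= 0 <= seq[m + 1]:
            count += 1
    return count


def formula(p, q, m, n):
    """Case (1) plus Case (2) of the solution."""
    return (comb(m + p, m) * comb(n + q + 1, n + 1)
            + comb(m + p, m + 1) * comb(n + q, n))


if __name__ == "__main__":
    for params in [(1, 1, 1, 1), (2, 3, 2, 1), (3, 2, 1, 3)]:
        print(params, brute_force(*params), formula(*params))
```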


Solved also by S.-J. Bang (Korea), J. C. Binz (Switzerland), D. Callan, R. J. Chapman (U.K.), M. 
Dindos (Slovakia) J. Fukuta (Japan), K. S. Kedlaya (student), E. F. Knapp, A. Nijenhuis, R. B. Richter 
(Canada), A. Tissier (France), M. Vowe (Switzerland), Anchorage Math Solutions Group, National 
Security Agency Problems Group, and the proposer. Two incorrect solutions were received. 


Strong Fixed Points of Permutations 
E 3467 [1991, 853]. Proposed by Todd Feil, Denison University, Granville OH, and 
Gary Kennedy, Oberlin College, Oberlin OH. 


A permutation $\pi$ on the set {1, 2, ..., n} is said to have j as a strong fixed point
if $\pi(k) < j$ for k < j and $\pi(k) > j$ for k > j. Let h(n) be the number of
permutations on {1, 2, ..., n} having at least one strong fixed point. Prove that

$$2(n - 1)! - (n - 2)! \le h(n) \le 2(n - 1)!$$

for n > 1.
Solution by David Callan, University of Wisconsin, Whitewater, WI. For the lower
bound, note that 1 and n are each strong fixed points for (n − 1)! permutations, and
(n − 2)! of these have been counted twice. For $n \le 4$, equality holds, since 1 or n
is a strong fixed point whenever 2 or n − 1 is a strong fixed point. For $n \ge 5$, both
inequalities are strict.

The permutations that do not fix 1 or n cannot strongly fix 2 or n − 1. We
bound the contributions for $3 \le j \le n - 2$. The permutations that strongly fix j
but not 1 or n permute {1, ..., j − 1} without fixing 1 and {j + 1, ..., n} without
fixing n; there are $[(j - 1)! - (j - 2)!]\cdot[(n - j)! - (n - j - 1)!] = (j - 2)(j - 2)!\,
(n - j - 1)(n - j - 1)!$ of these. By comparing successive terms, one notes that
this is maximized at the extremes. With (n − 4) choices for j, the additional
contributions are bounded by $(n - 4)(n - 4)(n - 4)! < (n - 2)!$, as required.
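
For small n the quantity h(n) can be computed by listing all permutations, which gives a direct check of the bounds and of the claim about equality. The function names below are assumptions made for this sketch; the enumeration grows like n!, so it is only practical for small n.

```python
from itertools import permutations
from math import factorial


def has_strong_fixed_point(perm):
    """perm[k-1] is pi(k). True if some j has pi(k) < j for k < j and pi(k) > j for k > j."""
    n = len(perm)
    for j in range(1, n + 1):
        if (all(perm[k - 1] < j for k in range(1, j))
                and all(perm[k - 1] > j for k in range(j + 1, n + 1))):
            return True
    return False


def h(n):
    """Number of permutations of {1, ..., n} with at least one strong fixed point."""
    return sum(1 for perm in permutations(range(1, n + 1))
               if has_strong_fixed_point(perm))


if __name__ == "__main__":
    for n in range(2, 8):
        lower = 2 * factorial(n - 1) - factorial(n - 2)
        upper = 2 * factorial(n - 1)
        print(n, lower, h(n), upper)
```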


Editorial comment. B. M. M. de Weger found the asymptotic expansion

$$2(n - 1)! - (n - 2)! + 2(n - 3)! + 4(n - 4)! + 22(n - 5)! + 125(n - 6)! + 834(n - 7)! + O((n - 8)!)$$

for h(n). Although there is no simple exact formula for h(n), the generating
function for the permutations with no strong fixed point, $\sum_{n \ge 0} (n! - h(n))x^n = F(x)/(1 + xF(x))$,
where $F(x) = \sum_{n \ge 0} n!\,x^n$, appears in R. P.
Stanley, Enumerative Combinatorics, Vol. I (Wadsworth and Brooks/Cole 1986),
Exercise 32b, pages 49 and 61.


Solved also by R. J. Chapman (U.K.), W. Y. C. Chen, P. Cizek (student, France), R. High, N. 
Komanda, D. W. Koster, O. P. Lossers (The Netherlands), H. M. Marston, I. Praton, R. W. Sheets, A. 
Tissier (France), R. Tschiersch (Germany), D. B. Tyler, K. Wayland, B. M. M. de Weger (The 
Netherlands), National Security Agency Problems Group, University of Wyoming Problem Circle, and 
the proposer. Three incorrect solutions were received. 


Primitive Trigonometric Power Sums 


E 3468 [1991, 853]. Proposed by Curtis Cooper, Central Missouri State University, 
Warrensburg, MO, Robert E. Kennedy, Central Missouri State University, and 
Stanley Rabinowitz, Westford, MA. 


Suppose m and n are positive integers such that all prime factors of n are 
larger than m. 

(a) Prove that 
\[ \sideset{}{^*}\sum_{k=1}^{n} \sin^{2m}\!\left(\frac{k\pi}{n}\right) = \frac{\phi(n) - \mu(n)}{4^{m}} \binom{2m}{m}, \]
where $\sum^{*}$ denotes summation over integers k relatively prime to n. (Here $\phi$ and $\mu$
denote the arithmetic functions of Euler and Möbius, respectively.)

(b) Find a similar formula for 


\[ \sideset{}{^*}\sum_{k=1}^{n} \cos^{2m}\!\left(\frac{k\pi}{n}\right). \]


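Solution by Kevin Ford, student, University of Illinois, Urbana, IL. For part (b)
we show that

\[ \sideset{}{^*}\sum_{k=1}^{n} \cos^{2m}\!\left(\frac{k\pi}{n}\right) = \frac{\phi(n) - \mu(n)}{4^{m}} \binom{2m}{m} + \mu(n). \]

For part (a), the standard binomial expansion yields

\[
\begin{aligned}
\sideset{}{^*}\sum_{k=1}^{n} \sin^{2m}\!\left(\frac{k\pi}{n}\right)
 &= \sideset{}{^*}\sum_{k=1}^{n} \left(\frac{e^{\pi i k/n} - e^{-\pi i k/n}}{2i}\right)^{2m} \\
 &= \frac{(-1)^{m}}{4^{m}} \sum_{j=-m}^{m} (-1)^{m+j} \binom{2m}{m+j} \sideset{}{^*}\sum_{k=1}^{n} e^{((m+j)-(m-j))\pi i k/n} \\
 &= \frac{1}{4^{m}} \sum_{j=-m}^{m} (-1)^{j} \binom{2m}{m+j} \sideset{}{^*}\sum_{k=1}^{n} e^{2\pi i jk/n}.
\end{aligned}
\]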

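If j = 0, then $\sideset{}{^*}\sum_{k=1}^{n} e^{2\pi i jk/n} = \sideset{}{^*}\sum_{k=1}^{n} 1 = \phi(n)$. If j ≠ 0, the hypotheses of the
problem imply that (j, n) = 1, hence as k runs through the set of reduced residues
modulo n, so does h = jk. In this case,

\[
\begin{aligned}
\sideset{}{^*}\sum_{k=1}^{n} e^{2\pi i jk/n}
 &= \sideset{}{^*}\sum_{h=1}^{n} e^{2\pi i h/n}
  = \sum_{h=1}^{n} e^{2\pi i h/n} \sum_{d \mid (h,n)} \mu(d) \\
 &= \sum_{d \mid n} \mu(d) \sum_{l=1}^{n/d} e^{2\pi i l d/n} \\
 &= \mu(n)e^{2\pi i} + \sum_{d \mid n,\ d < n} \mu(d)\, e^{2\pi i d/n}\, \frac{1 - e^{2\pi i}}{1 - e^{2\pi i d/n}} \\
 &= \mu(n).
\end{aligned}
\]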


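Splitting off the term for j = 0 first, we see that

\[
\sideset{}{^*}\sum_{k=1}^{n} \sin^{2m}\!\left(\frac{k\pi}{n}\right)
 = \frac{\phi(n)}{4^{m}} \binom{2m}{m} + \frac{\mu(n)}{4^{m}} \sum_{j \ne 0} (-1)^{j} \binom{2m}{m+j}
 = \frac{\phi(n) - \mu(n)}{4^{m}} \binom{2m}{m},
\]

since the full alternating sum $\sum_{j=-m}^{m} (-1)^{j} \binom{2m}{m+j}$ vanishes.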

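To obtain (b), we proceed as in part (a). This gives

\[
\begin{aligned}
\sideset{}{^*}\sum_{k=1}^{n} \cos^{2m}\!\left(\frac{k\pi}{n}\right)
 &= \sideset{}{^*}\sum_{k=1}^{n} \left(\frac{e^{\pi i k/n} + e^{-\pi i k/n}}{2}\right)^{2m} \\
 &= \frac{\phi(n)}{4^{m}} \binom{2m}{m} + \frac{\mu(n)}{4^{m}} \sum_{j \ne 0} \binom{2m}{m+j} \\
 &= \frac{\phi(n) - \mu(n)}{4^{m}} \binom{2m}{m} + \mu(n),
\end{aligned}
\]

since $\sum_{j=-m}^{m} \binom{2m}{m+j} = 4^{m}$.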
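
Both closed forms are easy to test numerically. The following is a minimal sketch, assuming floating-point evaluation of the sums is accurate enough for small n and m; the helper names (`totient`, `mobius`, `primitive_power_sums`, `closed_forms`) are illustrative and do not come from the solution.

```python
from math import comb, cos, gcd, isclose, pi, sin


def totient(n):
    """Euler's phi, by direct count."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


def mobius(n):
    """Moebius mu, by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # a squared prime factor
            result = -result
        p += 1
    return -result if n > 1 else result


def primitive_power_sums(m, n):
    """Sums of sin^(2m) and cos^(2m) of k*pi/n over k coprime to n."""
    ks = [k for k in range(1, n + 1) if gcd(k, n) == 1]
    s = sum(sin(k * pi / n) ** (2 * m) for k in ks)
    c = sum(cos(k * pi / n) ** (2 * m) for k in ks)
    return s, c


def closed_forms(m, n):
    """Right-hand sides of parts (a) and (b)."""
    main = (totient(n) - mobius(n)) * comb(2 * m, m) / 4 ** m
    return main, main + mobius(n)


# every prime factor of n must exceed m
for m, n in [(1, 3), (3, 7), (2, 35), (6, 49)]:
    print(m, n, primitive_power_sums(m, n), closed_forms(m, n))
```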

Editorial comment. The proposers included a reference to Stanley Rabinowitz,
“Problem 1463,” Crux Mathematicorum, [1989, 207; 1990, 280], which dealt with a
similar sum without the restriction to values of k relatively prime to n. In that
form, one is essentially identifying the constant term of the Fourier series of
$\cos^{2m}(x)$. This could be generalized to a use of the entire Fourier series of this
function as a discrete Fourier transform. Brian Conolly supplied a reference to
B. W. Conolly and I. J. Good, “A table of discrete Fourier transform pairs”, SIAM 
J. Appl. Math., 32 (1977), 810-822, which organizes work on this and similar 
formulas. In this context, the key steps in solving the present problem amount to 
the calculation of the transform of the characteristic function of a reduced set of 
residues modulo n. 


Solved also by J. C. Binz (Switzerland), D. Callan, R. J. Chapman (U.K.), P. Cizek (student, France), 
C. Efthimiou, N. J. Fine, K. S. Kedlaya (student), D. W. Koster, L. E. Mattics, A. Pedersen (Denmark), 
G. Thompson, J. C. Vera Lizcano (Colombia), Anchorage Math Solutions Group, and the proposers. 


An Identity Related to the Landen Transform 


6672 [1991, 862]. Proposed by H. B. Kushner, Nathan S. Kline Institute for 
Psychiatric Research, Orangeburg, NY. 


If a and b are positive real numbers, prove that

\[
\int_0^{\pi/2} \{(a\cos^2\phi + b\sin^2\phi)(a\sin^2\phi + b\cos^2\phi)\}^{-1/2}\,d\phi
 = \int_0^{\pi/2} (a^2\cos^2\theta + b^2\sin^2\theta)^{-1/2}\,d\theta
\]

and use it to prove that the integral on the right is unchanged if a and b are
replaced by $(ab)^{1/2}$ and (a + b)/2, respectively.


Solution by B. W. Conolly, Cambridge, U.K. Let J(a, b) and I(a, b) be the
integrals on the left and right, respectively. The substitution $\tan\phi = \sqrt{b/a}\,\tan\theta$
shows that J(a, b) = I(a, b). Moreover, the expression under the radical in J(a, b)
can be written

\[ (a\cos^2\phi + b\sin^2\phi)(a\sin^2\phi + b\cos^2\phi) = a_1^2\cos^2 2\phi + b_1^2\sin^2 2\phi, \]

where $a_1 = (ab)^{1/2}$ and $b_1 = (a + b)/2$. Thus

\[
\begin{aligned}
I(a,b) = J(a,b) &= \int_0^{\pi/2} (a_1^2\cos^2 2\phi + b_1^2\sin^2 2\phi)^{-1/2}\,d\phi \\
 &= \frac{1}{2}\int_0^{\pi} (a_1^2\cos^2\theta_1 + b_1^2\sin^2\theta_1)^{-1/2}\,d\theta_1 \qquad (\theta_1 = 2\phi) \\
 &= \int_0^{\pi/2} (a_1^2\cos^2\theta_1 + b_1^2\sin^2\theta_1)^{-1/2}\,d\theta_1 = I(a_1,b_1).
\end{aligned}
\]
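
The two identities J(a, b) = I(a, b) and I(a, b) = I(a₁, b₁) can be checked numerically. This is a rough sketch, assuming a plain midpoint rule with a fixed number of steps is accurate enough for these smooth integrands; the quadrature helper and the sample values of a and b are illustrative choices.

```python
from math import cos, isclose, pi, sin, sqrt


def midpoint(f, lo, hi, steps=20000):
    """Composite midpoint rule for the integral of f over [lo, hi]."""
    width = (hi - lo) / steps
    return width * sum(f(lo + (i + 0.5) * width) for i in range(steps))


def I(a, b):
    """Integral on the right-hand side of the problem."""
    return midpoint(lambda t: 1 / sqrt((a * cos(t)) ** 2 + (b * sin(t)) ** 2), 0, pi / 2)


def J(a, b):
    """Integral on the left-hand side of the problem."""
    def integrand(p):
        c2, s2 = cos(p) ** 2, sin(p) ** 2
        return 1 / sqrt((a * c2 + b * s2) * (a * s2 + b * c2))
    return midpoint(integrand, 0, pi / 2)


a, b = 1.0, 3.0
a1, b1 = sqrt(a * b), (a + b) / 2
print(I(a, b), J(a, b), I(a1, b1))  # the three values should agree closely
```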


Editorial comment. Many solvers included material on the arithmetic-geometric
mean or elliptic integrals in their proofs. To see the connection with elliptic
integrals, assume a < b and set k = a/b, $k' = \sqrt{1 - k^2}$. Then

\[
\begin{aligned}
I(a,b) &= \int_0^{\pi/2} (a^2\cos^2\theta + b^2\sin^2\theta)^{-1/2}\,d\theta
        = \frac{1}{b}\int_0^{\pi/2} (1 - k'^2\cos^2\theta)^{-1/2}\,d\theta \\
       &= \frac{1}{b}\int_0^{\pi/2} (1 - k'^2\sin^2\theta)^{-1/2}\,d\theta = \frac{1}{b}K(k'),
\end{aligned}
\]

where K = K(k) and K′ = K(k′) are the usual complete elliptic integrals of the
first kind for the modulus k. The Landen Transformation (see [1], [2] or [4]) states
that

\[ K'\!\left(\frac{2\sqrt{k}}{1+k}\right) = \frac{1+k}{2}\,K'(k). \tag{1} \]

With k = a/b, one can check that $2\sqrt{k}/(1 + k) = a_1/b_1$. Many solvers observed
that $I(a_1, b_1) = I(a, b)$ follows from the previous two equations.

Historically, elliptic integrals led to elliptic functions, which in turn led to
elliptic curves. From the modern point of view, elliptic integrals are periods of an
elliptic curve. To see how this works, consider the elliptic curve E defined by
$w^2 = (a^2 + t^2)(b^2 + t^2)$. One can construct E by gluing together two copies of
$\mathbb{C} \cup \{\infty\}$ which are cut from ia to ib and from −ia to −ib. The cuts allow w(t) to
be given by a well-defined function on each sheet. A homology basis of E is
$\{\gamma_1, \gamma_2\}$, where $\gamma_1$ is $\mathbb{R} \cup \{\infty\}$ on one copy of $\mathbb{C} \cup \{\infty\}$, and $\gamma_2$ consists of the
segments from −ia to ia on both copies of $\mathbb{C} \cup \{\infty\}$. Since

\[ \frac{dt}{w} = \frac{dt}{\sqrt{(a^2 + t^2)(b^2 + t^2)}} \]

is a nonvanishing holomorphic form on E (unique up to a constant factor), the
periods of E are the integrals


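\[ \int_{\gamma_1} \frac{dt}{w} = 2\int_0^{\infty} \frac{dt}{w}, \qquad \int_{\gamma_2} \frac{dt}{w} = 4\int_0^{ia} \frac{dt}{w}. \]

The substitution t = b tan θ shows that $\int_0^\infty dt/w = I(a, b)$, so that 2I(a, b) is a
period of the elliptic curve E. In terms of complete elliptic integrals of the first
kind, the periods are

\[ \int_{\gamma_1} \frac{dt}{w} = \frac{2}{b}K', \qquad \int_{\gamma_2} \frac{dt}{w} = \frac{4i}{b}K. \]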


We can also reconstruct E from its periods as follows. The quotient

\[ \tau = \frac{\int_{\gamma_2} dt/w}{\int_{\gamma_1} dt/w} = \frac{2iK}{K'} \in \mathfrak{h} = \{x + iy \in \mathbb{C} : y > 0\} \]

is sometimes called the period of E, and there is a complex analytic isomorphism
$E \cong \mathbb{C}/[1, \tau]$ which can be given explicitly in terms of the Jacobi elliptic functions
sn, cn and dn. It follows that E is uniquely determined by $\tau$ modulo the action of
SL(2, Z) on $\mathfrak{h}$.

The identity $I(a, b) = I(a_1, b_1)$ relates E to the elliptic curve $E_1$ defined by the
equation $w_1^2 = (a_1^2 + t_1^2)(b_1^2 + t_1^2)$. The substitutions used in the solution of the
problem were $\tan\theta = \sqrt{a/b}\,\tan\phi$ and $\theta_1 = 2\phi$. To get to the elliptic curves, we
use $t = b\tan\theta$ and $t_1 = b_1\tan\theta_1$. The resulting change of variables is
$t_1 = 2a_1b_1t/(a_1^2 - t^2)$, which comes from a map of the underlying elliptic curves since

\[ t_1 = \frac{2a_1b_1t}{a_1^2 - t^2}, \qquad w_1 = \frac{a_1b_1w\,(a_1^2 + t^2)}{(a_1^2 - t^2)^2} \]

defines a function $\Phi: E \to E_1$. This map preserves the group structure of E and
$E_1$ and is an example of what is called an isogeny.
The isogeny $\Phi$ is the key to the whole story. Since it has degree two, one period
is preserved while the other is doubled. In fact, we have $2\,dt/w = dt_1/w_1$, and
since $\Phi$ maps $a_1$ to $\infty$, and $\infty$ to 0, we obtain

\[ \int_0^{\infty} \frac{dt_1}{w_1} = 2\int_0^{a_1} \frac{dt}{w} = \int_{a_1}^{\infty} \frac{dt}{w} + \int_0^{a_1} \frac{dt}{w} = \int_0^{\infty} \frac{dt}{w}, \tag{2} \]

where one uses $t \mapsto ab/t$ to justify the second equality. This proves $I(a, b) =
I(a_1, b_1)$, which in turn implies the Landen Transform (1). Furthermore, $\Phi$ takes ia
to $ia_1$, so that

\[ 2\int_0^{ia} \frac{dt}{w} = \int_0^{ia_1} \frac{dt_1}{w_1}, \tag{3} \]

and thus the other period is doubled as claimed. One can check that this proves
the other half of the Landen Transform, namely

\[ K\!\left(\frac{2\sqrt{k}}{1+k}\right) = (1 + k)K(k). \]

If we combine (2) and (3), we see that the period of $E_1$ is

\[ \tau_1 = \frac{4\int_0^{ia_1} dt_1/w_1}{2\int_0^{\infty} dt_1/w_1} = 2 \cdot \frac{4\int_0^{ia} dt/w}{2\int_0^{\infty} dt/w} = 2\tau. \tag{4} \]

There is also a connection with the arithmetic-geometric mean of Gauss (see [1],
[2] or [3]). We have $a_1 = \sqrt{ab}$, $b_1 = (a + b)/2$, and if we iterate this construction,
we obtain

\[ a_{n+1} = \sqrt{a_n b_n}, \qquad b_{n+1} = \frac{a_n + b_n}{2}, \qquad n = 1, 2, \ldots \]

Since a and b are positive, these numbers converge to a common limit which is
denoted $\mu = M(a, b)$ (one can also let a and b be arbitrary complex numbers, but
convergence is more complicated in this case; see [3]). Then the identity $I(a, b) =
I(a_1, b_1)$ implies

\[ I(a,b) = I(a_1,b_1) = I(a_2,b_2) = \cdots = I(\mu,\mu) = \int_0^{\pi/2} \frac{d\theta}{\sqrt{\mu^2\cos^2\theta + \mu^2\sin^2\theta}} = \frac{\pi}{2\mu}, \tag{5} \]

which proves that $I(a, b)M(a, b) = \pi/2$. Since there are other methods for
proving this relation between I(a, b) and M(a, b) (see [2]), $I(a, b) = I(a_1, b_1)$
becomes a consequence of the obvious identity $M(a, b) = M(a_1, b_1)$. We can also
study (5) from the point of view of the underlying elliptic curves. Let $E_n$ be
defined by $w_n^2 = (a_n^2 + t_n^2)(b_n^2 + t_n^2)$. Then (4) implies that the period $\tau_n$ of $E_n$ is
given by $\tau_n = 2^n\tau$, so that $\tau_n \to \infty$. This means that in the moduli space $\mathfrak{h}/SL(2, \mathbb{Z})$,
the elliptic curves $E_n$ are “converging” (the technical term is degenerating) to a
rational curve. Thus the limit integral in (5) is an integral over a rational curve,
which is why it is so easy.
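
The relation I(a, b)M(a, b) = π/2 gives a fast way to evaluate the integral. The sketch below compares a fixed number of arithmetic-geometric mean steps with direct quadrature; the step counts and the midpoint rule are illustrative assumptions, not part of the editorial comment.

```python
from math import cos, isclose, pi, sin, sqrt


def agm(a, b, iterations=40):
    """Arithmetic-geometric mean M(a, b) of two positive numbers."""
    for _ in range(iterations):
        # one step of the iteration a -> sqrt(ab), b -> (a + b)/2
        a, b = sqrt(a * b), (a + b) / 2
    return (a + b) / 2


def I(a, b, steps=20000):
    """Midpoint-rule estimate of the integral I(a, b)."""
    width = (pi / 2) / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * width
        total += 1 / sqrt((a * cos(t)) ** 2 + (b * sin(t)) ** 2)
    return width * total


a, b = 1.0, 3.0
print(I(a, b) * agm(a, b), pi / 2)  # the two numbers should agree closely
```

Because the iteration converges quadratically, a handful of steps already reaches machine precision; 40 is simply a generous fixed bound.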

The integrals I(a, b) and J(a, b) of this problem have other interpretations as
periods. In particular, I(a, b) is a period of a curve (of genus 1) with equations
$u^2 = a^2x^2 + b^2y^2$ and $x^2 + y^2 = 1$, while J(a, b) is a period of a curve (of genus
3) with equations $u^2 = (ax^2 + by^2)(bx^2 + ay^2)$ and $x^2 + y^2 = 1$. The changes
of variable in the integrals can then be explained in terms of functions between
these curves.

Solved also by J. Anglesio (France), F. Bachmann (Switzerland), S.-J. Bang (Korea), R. Betts
(student), K. V. Bhagwat (India), P. Bracken (Canada), W. A. Businger (Switzerland), R. J. Chapman 
(U.K.), Y. Diao, M. Dresevié & N. Cakié (Yugoslavia), Z. Guan & N. Passell, R. W. Hopper, D. 
Jespersen, I. Kastanas, P. Landweber, H. Lipman, N. J. Lord (U.K.), O. P. Lossers (The Netherlands), 
J. Melville (U.K.), G. Miller (student, Canada), A. Pechtl (Germany), C. E. Rieck Jr. & M. Q. Rieck, T. 
Schira (Germany), D. Trautman, R. L. Young, K. Zacharias (Germany), University of South Alabama 
Problem Group, and the proposer. 


An All-Ones Problem 


10197 [1992, 162]. Proposed by Uri Peled, University of Illinois, Chicago, IL. 


Light bulbs $L_1, L_2, \ldots, L_n$ are controlled by switches $S_1, S_2, \ldots, S_n$. Switch $S_i$
changes the on/off status of light $L_i$ and possibly the status of some other lights.
Assume that if $S_i$ changes the status of light $L_j$, then $S_j$ changes the status of light
$L_i$. Initially all the lights are off. Prove that it is possible to operate the switches in
such a way that all the lights are on.


Solution by O. P. Lossers, Eindhoven University of Technology, Eindhoven, The 
Netherlands. Define the matrix A as 


\[ A_{ij} = \begin{cases} 1 & \text{if switch } S_i \text{ controls bulb } L_j, \\ 0 & \text{otherwise.} \end{cases} \]


Then A is a symmetric (0, 1)-matrix with all-one diagonal, and it should be proved 
that the all-one vector belongs to the column space of A, when calculated mod- 
ulo 2. 

More generally we shall prove that for a symmetric binary matrix A the 
diagonal d belongs to the column space modulo 2, denoted Im A. In this form the 
problem occurs as “Problem 798,” Nieuw Archief voor Wiskunde, (4) 9 (1991), 
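117-118. We give a different solution.


\[ d \in \operatorname{Im} A \text{ is equivalent to } (\operatorname{Im} A)^{\perp} \subseteq (d)^{\perp}. \]


So let $x \in (\operatorname{Im} A)^{\perp}$, i.e., $\sum_{i=1}^{n} x_i A_{ij} = 0$ for all j.

Hence $\sum_{i=1}^{n}\sum_{j=1}^{n} x_i A_{ij} x_j = 0$, which, by symmetry of A, reduces to
$\sum_{i=1}^{n} x_i^2 A_{ii} = 0$, because modulo 2 the off-diagonal terms cancel in pairs.
So $\sum_{i=1}^{n} x_i d_i = 0$ since $A_{ii} = d_i$ by definition and $x_i^2 = x_i$. Thus $x \in (d)^{\perp}$ as
required.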
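
The proof shows existence only, but a set of switches to press can be found by Gaussian elimination modulo 2. The following is a minimal sketch, assuming the matrix is given as a list of 0/1 rows; `solve_gf2`, `random_switch_matrix` and `lights_after` are hypothetical helper names, and the random test matrices are an illustrative choice.

```python
import random


def solve_gf2(A, b):
    """Return x with A x = b (mod 2), or None if the system is inconsistent."""
    n = len(A)
    rows = [A[i][:] + [b[i]] for i in range(n)]  # augmented matrix
    pivots, r = [], 0
    for c in range(n):
        p = next((i for i in range(r, n) if rows[i][c]), None)
        if p is None:
            continue  # no pivot in this column
        rows[r], rows[p] = rows[p], rows[r]
        for i in range(n):
            if i != r and rows[i][c]:
                rows[i] = [u ^ v for u, v in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
    if any(row[n] for row in rows[r:]):
        return None
    x = [0] * n  # free variables are set to 0
    for i, c in enumerate(pivots):
        x[c] = rows[i][n]
    return x


def random_switch_matrix(n, seed):
    """Random symmetric 0/1 matrix with all-one diagonal."""
    rng = random.Random(seed)
    A = [[0] * n for _ in range(n)]
    for i in range(n):
        A[i][i] = 1
        for j in range(i + 1, n):
            A[i][j] = A[j][i] = rng.randint(0, 1)
    return A


def lights_after(A, presses):
    """State of each light after pressing the switches i with presses[i] == 1."""
    n = len(A)
    return [sum(presses[i] * A[i][j] for i in range(n)) % 2 for j in range(n)]


A = random_switch_matrix(6, seed=1)
presses = solve_gf2(A, [1] * 6)
print(presses, lights_after(A, presses))
```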


Editorial comment. Most solvers used a matrix interpretation as above, but a 
few worked directly with a graph with incidence matrix A. The proposer, in 
consultation with N. Alon and L. Lovász, was able to trace this form of the result to
an unpublished result of T. Gallai (see L. Lovász, Combinatorial Problems and
Exercises, North-Holland, 1979, Exercise 5.17). Other readers provided references
to K. Sutner, “The σ-game and cellular automata,” this MONTHLY, 97 (1990),
24-34; F. Galvin, “Solution to problem 88-8,” Mathematical Intelligencer 11
(1989), 31-32; and K. Sutner, “Linear cellular automata and the Garden-of-Eden,”
Mathematical Intelligencer 11 (1989), 49-53 (especially theorem 3.2).


Solved by 41 readers and the proposer.


Products of Nilpotent Matrices


10200 [1992, 163]. Proposed by Daniel Goffinet, St. Etienne, France. 

(a) Prove that a (square) matrix over a field F is singular if and only if it is a 
product of nilpotent matrices. 

(b) If F =C, prove that the number of nilpotent factors can be bounded 
independently of the size of the matrix. 


Solution by Richard Stong, University of California, Los Angeles, CA. We will 
show that for any field F four nilpotent factors suffice. Clearly a product of 
nilpotent matrices is singular. Hence we need only decompose a singular matrix 
into nilpotent factors. 


Lemma 1. If A is a square matrix over F, then A is a product of two matrices that
can be put into Jordan canonical form with eigenvalues in F. If A is singular, we may
further assume that the Jordan canonical forms have a final row and a final column
of zeroes.


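Proof: Passing to a different basis, we may assume A breaks up into blocks of the
form

\[
B = \begin{pmatrix}
0 & 1 & 0 & \cdots & 0 \\
0 & 0 & 1 & \cdots & 0 \\
\vdots & & & \ddots & \vdots \\
0 & 0 & 0 & \cdots & 1 \\
-b_0 & -b_1 & -b_2 & \cdots & -b_{r-1}
\end{pmatrix},
\]

where the characteristic polynomial $p(t) = t^r + b_{r-1}t^{r-1} + \cdots + b_0$ is a power of
an irreducible polynomial. If $b_0 \ne 0$ (i.e., $p(t) \ne t^r$), consider the identity

\[
B = \begin{pmatrix}
1 & 0 & \cdots & 0 & 0 \\
0 & 1 & \cdots & 0 & 0 \\
\vdots & & \ddots & & \vdots \\
0 & 0 & \cdots & 1 & 0 \\
c_1 & c_2 & \cdots & c_{r-1} & c_r
\end{pmatrix}^{-1}
\times
\begin{pmatrix}
0 & 1 & 0 & \cdots & 0 \\
0 & 0 & 1 & \cdots & 0 \\
\vdots & & & \ddots & \vdots \\
0 & 0 & 0 & \cdots & 1 \\
-c_r b_0 & c_1 - c_r b_1 & \cdots & \cdots & c_{r-1} - c_r b_{r-1}
\end{pmatrix}.
\]

For any $\{c_i\}$ with $c_r \ne 0$ the first factor can be put in Jordan canonical form, since
the matrix being inverted has characteristic polynomial $(t - 1)^{r-1}(t - c_r)$. The characteristic polynomial of
the second factor is $p'(t) = t^r + (c_r b_{r-1} - c_{r-1})t^{r-1} + (c_r b_{r-2} - c_{r-2})t^{r-2}
+ \cdots + c_r b_0$. By choosing the $c_i$ appropriately we may assume this polynomial is
$(t - 1)^r$, hence splits over F. If A is singular, then we also get some blocks of the
form

\[
N = \begin{pmatrix}
0 & 1 & 0 & \cdots & 0 \\
0 & 0 & 1 & \cdots & 0 \\
\vdots & & & \ddots & \vdots \\
0 & 0 & 0 & \cdots & 1 \\
0 & 0 & 0 & \cdots & 0
\end{pmatrix}.
\]

For these use the identity

\[
N = \begin{pmatrix}
1 & 0 & 0 & \cdots & 0 & 0 & 0 \\
-1 & 1 & 0 & \cdots & 0 & 0 & 0 \\
1 & -1 & 1 & \cdots & 0 & 0 & 0 \\
\vdots & & & \ddots & & & \vdots \\
(-1)^{r} & (-1)^{r-1} & (-1)^{r} & \cdots & -1 & 1 & 0 \\
0 & 0 & 0 & \cdots & 0 & 0 & 0
\end{pmatrix}
\times
\begin{pmatrix}
0 & 1 & 0 & \cdots & 0 & 0 \\
0 & 1 & 1 & \cdots & 0 & 0 \\
0 & 0 & 1 & \cdots & 0 & 0 \\
\vdots & & & \ddots & & \vdots \\
0 & 0 & 0 & \cdots & 1 & 1 \\
0 & 0 & 0 & \cdots & 0 & 1
\end{pmatrix}.
\]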


Note that both factors have 0 as an eigenvalue of multiplicity one. Hence both of
their Jordan canonical forms have a final row and final column of zeroes. This 
shows that A is the product of two matrices that can be put into Jordan canonical 
form and if A is singular, then both have a final row and final column of zeroes. 
(In fact, if the null space of A is r-dimensional, we get the final r rows and final r 
columns all zero. Another interesting observation is that if F is infinite the proof 
above can be modified to show that both factors are diagonalizable.) After a 
change of basis both these factors have the form 


\[ \begin{pmatrix} B & 0 \\ 0 & 0 \end{pmatrix}, \]
where B is (n − r) × (n − r) upper triangular and 0 denotes a matrix of zeroes,
r × r, (n − r) × r, or r × (n − r), as required. (In fact, B has nonzero
entries only on and just above the diagonal.) The following lemma factors these.
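
Both factorizations used in the proof of Lemma 1 can be verified with exact arithmetic. The sketch below is one way to do so, assuming rational entries; it checks the companion-block identity in the equivalent form C·B = P (which avoids inverting C), with the $c_i$ chosen so that P is the companion matrix of $(t - 1)^r$. The function names and the sample polynomial are illustrative.

```python
from fractions import Fraction
from math import comb


def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def companion(coeffs):
    """Companion matrix of t^r + coeffs[r-1] t^(r-1) + ... + coeffs[0]."""
    r = len(coeffs)
    rows = [[int(j == i + 1) for j in range(r)] for i in range(r - 1)]
    rows.append([-c for c in coeffs])
    return rows


def split_companion(b):
    """Return (C, P) with C * companion(b) == P, P the companion matrix of (t - 1)^r."""
    r = len(b)
    target = [(-1) ** (r - i) * comb(r, i) for i in range(r)]  # coefficients of (t - 1)^r
    c_r = Fraction(target[0]) / b[0]  # requires b[0] != 0
    c = [c_r * b[i] - target[i] for i in range(1, r)] + [c_r]
    C = [[int(i == j) for j in range(r)] for i in range(r - 1)] + [c]
    return C, companion(target)


def split_nilpotent(r):
    """Return (L, U) with L * U equal to the r x r nilpotent Jordan block."""
    L = [[(-1) ** (i - j) if j <= i < r - 1 else 0 for j in range(r)] for i in range(r)]
    U = [[int(j == i + 1 or (j == i and i > 0)) for j in range(r)] for i in range(r)]
    return L, U


b = [2, 0, 1]  # p(t) = t^3 + t^2 + 2
C, P = split_companion(b)
print(matmul(C, companion(b)) == P)
L, U = split_nilpotent(5)
print(matmul(L, U) == companion([0] * 5))
```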


Lemma 2. Let A be an n × n matrix of the form

\[ \begin{pmatrix} B & 0 \\ 0 & 0 \end{pmatrix}, \]
where B is an (n − r) × (n − r) upper triangular matrix (r ≥ 1). Then A is a product of
two nilpotent matrices.


Proof:


\[ A = \begin{pmatrix} 0 & B \\ 0 & 0 \end{pmatrix} \begin{pmatrix} 0 & 0 \\ I_{n-r} & 0 \end{pmatrix}, \]


where $I_{n-r}$ denotes the (n − r) × (n − r) identity matrix and 0 denotes a matrix
of zeroes, either r × r, (n − r) × r, or r × (n − r), as required. The first factor is
strictly upper triangular and the second strictly lower triangular, hence both are nilpotent.
Applying these two lemmas solves the problem.
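
The factorization in Lemma 2 can be written out explicitly. The following is a small sketch, assuming integer entries and matrices stored as lists of rows; `split_singular_block` and `is_nilpotent` are hypothetical names, and the sample matrix B is an arbitrary upper triangular example.

```python
def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def split_singular_block(B, r):
    """Return the two factors of Lemma 2 for A = [[B, 0], [0, 0]] with r zero rows and columns."""
    m = len(B)
    n = m + r
    first = [[0] * n for _ in range(n)]
    second = [[0] * n for _ in range(n)]
    for i in range(m):
        for j in range(m):
            first[i][r + j] = B[i][j]  # B shifted r columns to the right
        second[r + i][i] = 1  # identity block shifted r rows down
    return first, second


def is_nilpotent(M):
    """An n x n matrix is nilpotent exactly when its n-th power is zero."""
    power = M
    for _ in range(len(M) - 1):
        power = matmul(power, M)
    return all(v == 0 for row in power for v in row)


B = [[1, 2, 3], [0, 4, 5], [0, 0, 6]]
first, second = split_singular_block(B, r=1)
print(matmul(first, second))
print(is_nilpotent(first), is_nilpotent(second))
```

The shift by r columns is what makes the first factor strictly upper triangular, which is why the lemma needs r ≥ 1.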


Editorial comment. Frank Schmidt and Pei Yuan Wu submitted references to 
Pei Yuan Wu, “Products of nilpotent matrices,” Linear Algebra Appl. 96 (1987), 
227-232. In this article, Wu proves that every singular complex matrix A is a 
product of two nilpotent matrices, except for the case where A is a 2 by 2 nonzero 
nilpotent matrix (in which case he shows that such an expression is never possible). 
Wu also provided a reference to T. J. Laffey, “Factorizations of integer matrices as 
products of idempotents and nilpotents,” Linear Algebra Appl. 120 (1989), 81-93. 
In the introduction to this article, Laffey asserts that Wu’s result could be 
extended to arbitrary fields. However, no solution giving a complete argument 
leading to fewer than four factors over a general field was received. 


Solved also by I. Kastanas, J. Sangroniz (Spain), T. Zeanah (part b only), and the proposer. 


Collaborating editors: David F. Appleyard, Paul T. Bateman, Bruce C. Berndt, 
Duane M. Broline, Barry W. Brunson, Frank S. Cater, Gulbank D. Chakerian, 
Underwood Dudley, Gerald A. Edgar, Michael A. Filaseta, Ira M. Gessel, Richard 
A. Gibbs, Jerrold R. Griggs, Douglas A. Hensley, John R. Isbell, Mourad E. H. 
Ismail, Murray Klamkin, Daniel J. Kleitman, Frederick W. Luttmann, Frank B. 
Miles, Richard Pfiefer, Stephen L. Portnoy, J. O. Shallit, John Henry Steelman, 
Kenneth B. Stolarsky, David E. Tepper, Douglas B. Tyler, Daniel Ullman, and 
William E. Watkins. 


Answer to Picture Puzzle 
(p. 748) 
No, they are not: they are Emile Borel and 
Armand Borel. 


Dr. Marston Morse, professor of mathematics at Harvard
University, has accepted a call to a professorship of
mathematics at the Institute for Advanced Study at 
Princeton, New Jersey. The staff of the School of Mathe- 
matics now consists of the following members: Drs. Albert 
Einstein, Oswald Veblen, J. W. Alexander, John von 
Neumann, Herman Weyl and Marston Morse. 


42(1935), 124 


1993] PROBLEMS AND SOLUTIONS 809 




NOTICE TO AUTHORS 


The Monthly publishes articles, notes, and other fea- 
tures about mathematics and the profession. The 
readership of the Monthly is intended to include ev- 
erybody who is mathematically inclined, including of 
course professional mathematicians and students of 
mathematics at all collegiate levels. While no single 
article or feature is likely to appeal to everyone, mate- 
rial should interest and be accessible to a large num- 
ber of readers. This is the most important criterion for 
acceptance. 


Articles may be expositions of old results or presenta- 
tions of new ones. They may concern all of mathe- 
matics or one small area, a broad development or a 
single application, historical reminiscences or one 
important event. While some articles may contain the 
author’s new research, the novelty of material and 
generality of the results is far less important than the 
clarity of exposition and general interest. Discussing 
one illuminating case of a well known result is far 
better than providing all the details of an obscure but 
new proposition. Articles in the Monthly are sup- 
posed to inform and to entertain; they are meant to 
be read rather than archived. 


Notes are short and possibly informal articles. A note 
may concern a clever new proof of an old theorem, a 
novel way to present tired material, or a lively discus- 
sion of a philosophical (but still mathematical) issue. 
Also, any topic is suitable, so long as it is related to
mathematics. Because a note is short, the first few
sentences are the most important part: They should
explain the purpose and invite the reader in. Pho- 
tographs or diagrams often will attract the reader’s 
attention. 


All articles and notes should be sent to the editor: 


JOHN EWING, 

Department of Mathematics, 
Indiana University, 
Bloomington, IN 47405. 


Please send 3 copies, typewritten on only one side of 
the paper. Illustrations should be carefully drawn on 
separate sheets of paper in black ink; the original 
should be without lettering and two copies should 
have appropriate captions and lettering indicated. 


Proposed problems or solutions should be sent to: 


RICHARD BUMBY, 
P.O. Box 10971 
New Brunswick, NJ 08906-0971. 


Please send 2 copies of all material, typewritten if 
possible. 


Letters to the Editor, both for publication and for 
private reading, should be sent to the Editor at the 
address given above. Comments, including criti- 
cisms, are welcome, as are all suggestions for mak- 
ing the Monthly a lively, entertaining, and informative 
journal. 
Thomas Archer Hirst—
Mathematician Xtravagant 
V. London in the 1860s 


J. Helen Gardner and Robin J. Wilson 


After leaving the Academy I took my ticket for London by way of Dieppe and Newhaven... The 
passage was without exception the smoothest I ever made, the Channel was as quiescent as a 
duck-pond, the day beautiful and sunny... I was right glad to see the white cliffs of my native 
land and my eyes lingered gladly on the villages with their churches and on the farm-steads 
about which was an air of solid domestic comfort and prosperity which we look for in vain out of 
England. In short I felt a quiet pleasure in realising the fact that after long wanderings I was 
coming home at last and that sources of happiness were in store for me to which I had long been 
a stranger... 


After two years establishing his reputation in Europe, Hirst decided that it was 
time to return home. On arrival in London, in the summer of 1859, he took up 
lodgings near John Tyndall. 


9th October 1859: ... Indeed my London life commences well. I have John close to me, can run 
into his rooms half an hour every evening and finish off the day with pleasant useful conversa- 
tion. Never in my life was I better situated for getting through solid work and having the 
advantage of the best companionship. I trust the effects of all this will be visible by and bye... 


[Portraits: James Joseph Sylvester (1814-1897) and Arthur Cayley (1821-1895)]

By now, Tyndall was firmly established in the London scientific scene, and he
introduced Hirst into his circle of friends. This enabled Hirst to become ac- 
quainted with the major scientific figures of the day. But what might have been 
merely polite introductions often developed further. 


16th October 1859: On Monday having received a letter from [James Joseph] Sylvester I went to 
see him at the Athenaeum Club. We had an hour’s talk in the little waiting room. He talked 
continuously for that time about his partitions of numbers and strange to say he was less obscure 
than I expected. He was, moreover, excessively friendly, wished we lived together, asked me to 
go live with him at Woolwich and so forth. In short he was excentrically affectionate... 


Just before Christmas he called on Arthur Cayley, and spent a very interesting 
hour talking about Cayley’s work on curves of the third order and a new method 
for obtaining the squares of the differences of the roots of a quintic, and his own 
work on derived curves of double curvature. 


23rd December 1859: ... I explained what I was doing in which he expressed some interest. I 
was a little amused and encouraged too by his asking me for a definition of the rectifying plane. 
The great geometer had forgotten it for the moment. 

What a wonderful head he has, not merely round but spheroidal with the largest diameter 
parallel to his eyes, or rather to the line joining his ears. He never sits upright on his chair but 
with his posterior on the very edge he leans one elbow on the seat of the chair and throws the 
other arm over the back. Yet he is a keen sighted and extraordinary man, gentle I think by
nature and at once timid, modest and reticent. Often when he speaks he shuts his eyes and talks 
as if he were reading from an unseen book, and talks well too so that one has to sharpen one’s 
own wits to follow him. 


His reading was wide, ranging from Alfred Tennyson’s Idylls of the King to George 
Boole’s Differential Equations. In the latter he found a passage that seemed to be 
related to his work. 


29th January 1860: ... In the chapter on Partial Differential Equations p. 342 occurs this 
passage “Similar but more interesting applications may be drawn from the problem of the 
determination of equally attracting surfaces.” This shows he has read my Memoir, but my name 
is not mentioned; yet I think I am the only one who has considered the problem in question. 


Hirst also attended lectures of current importance, such as one given by his friend 
Thomas Huxley on Charles Darwin’s recently proposed theory of evolution. 


12th February 1860: ... On Friday evening I heard Huxley’s lecture on the Origin of Species at 
the Royal Institution. He gave us a noble peroration which is the part I shall remember 
longest... Tyndall introduced me to Babbage with whom we walked part of the way home. 


Employment for a mathematician was as difficult to obtain as it had been seven 
years earlier when he returned from Berlin—but again, Hirst fell on his feet, being 
offered the post of mathematics teacher at the University College School. This was 
initially a temporary post, which Hirst accepted gladly. The headmaster was the 
distinguished classical scholar and mathematician Thomas Hewitt Key. 


4th March 1860: ... I was introduced by Key to my class on Wednesday morning at 9.15, and 
have continued to attend ever since. I am occupied there from 9.15 A.M. to 3 P.M. with an 
interval of an hour and a half at noon. My salary is £1 per day. For a school the instruction is of 
a superior kind. The highest class is engaged with the 6th book of Euclid, the Binomial Theorem 
in Algebra, De Moivre’s theorem in Trigonometry and the simple machine in mechanics. So far I 
have succeeded quite well. I have merely been learning their powers. 
University College School
Founded in 1833, this school was built on the site of University College, in Gower Street,
London, where it remained until moving to its present location in Hampstead in 1907.


... I rise every morning now at 7, breakfast at 7.30, light my pipe at 8 and smoke and attend
to other necessary matters connected with health until 8.30, then walk down to Regent Street 
where I take the Islington omnibus which puts me down at the end of Gower Street within a few 
minutes walk of the College. At noon I get a chop and glass of sherry where I can and return 
soon after 3 P.M. by the omnibus pretty well tired. Promising as my position is I should hesitate
to accept it as a permanency. The consideration of £1 per day would not induce me to neglect 
my dear “derived surfaces”. 


Not long after this, Hirst found himself drawn towards a rather different type of 
activity. 


3rd June 1860: ... a week ago (Friday week) I became enrolled in the Volunteer Guards (6 feet 
men). I paid one guinea entrance and one guinea subscription in advance. I have been twice to 
drill once at the house of our Sergeant and Secretary Mr Halse who lives quite near, and once in 
undress uniform at Hungerford Hall. The uniform is exceedingly conspicuous, a red tunic with 
black belt and shoulder strap, a black patent-leather helmet (Prussian shape) with black plume 
and black trousers with red stripe. The undress is a red flannel jacket. I have joined them chiefly 
for the sake of the drill which I hope will be beneficial but I am also quite prepared to accept all 
the consequences and in case of need to defend my country with my life. As far as physique is 
concerned the Guards are a fine body of men about 70 in number, the uniform is expensive and 
consequently the members are all gentlemen. 


In the meantime, Hirst continued, albeit slowly, with his translations and lecturing. 
His investigations had ground to a standstill, but even so, in March 1861, at the
age of only 30, Hirst was nominated for a Fellowship of the Royal Society. His 
certificate was signed by (among others) Boole and Sylvester. There were many 
other candidates, so he was not hopeful.


24th March 1861: ... the Royal Society were discussing my merits. I have two powerful 
competitors [Henry] Smith of Oxford and [James Clark] Maxwell of King’s College; unless all 
three can be admitted I must expect to be the excluded one... Yesterday I was at an evening 
party at Dr. Carpenter’s and was introduced to Helmholtz and Maxwell with both of whom I had 
long conversations. The former is a little reserved, the latter talkative with a Scotch brogue, he 
took great interest in my ripples about which we spoke for some time... 
The Royal Society of London

The Royal Society was founded in 1662 by King Charles II. In 1863 it moved into these new 
rooms in Burlington House, Piccadilly. It is now located in Carlton House Terrace, near St. 
James’s Park. 


Figure 3. 


Unfortunately, his health was causing him problems. Even after an Easter vaca- 
tion, he felt exceedingly weak and spiritless. Although he suffered from no 
particular ailment, he had no energy for either his schoolwork or his researches. 


14th April 1861: ... I pay the greatest attention to diet and avoid smoking all day. But instead of 
being better for such abstinence I feel weaker. I must persevere however for although the two 
pipes I still allow myself do me good at the time I have a firm belief that for my complaint, 
indigestion, smoking must be injurious or at least cannot be beneficial. Sooner or later therefore 
abstinence must tell upon my health. Who knows how much of my present debility is due to the
habit (of 11 or 14 years standing) of smoking five or six times a day. To cut down smoking to two 
pipes a day has been one of the greatest trials I have gone through, but perseverance diminishes 
the trial. 


A week later, however, he learned from Tyndall that the Council of the Royal 
Society had placed his name upon the list of candidates to be elected Fellows in 
June. There were about 45 candidates, only 15 of whom were chosen.


21st April 1861: ... Next morning I received the following note from Cayley:
Dear Sir,—I have much pleasure in being able to inform you of your name being on the Council 
list for the next election of Fellows of the Royal Society. 

Believe me yours very sincerely 


A. Cayley 


... Of course the news was very welcome to me and in reply to Cayley I assured him that the
honour would always be enhanced to me by the thought that his name was amongst those of the 
Council who had lent me so generous a support. 


In March 1862 he recorded that he had been unable to get through his teaching 
work at the School. Extreme flatulence had produced giddiness which totally 
prevented him from standing at the board, and a strange numbness crept over his 
right arm and leg. It transpired that he had become very ill with dyspepsia, from 
which he was to suffer for over a month. Happily, he was soon back enjoying the 
company of his friends. 


4th May 1862: ... Yesterday, Saturday, Cayley, Sylvester and Harley dined with me. Tyndall was 
not present. It was without question to me the most interesting dinner party I ever gave and I 
believe one of the most successful at least all appeared to enjoy themselves. I contrived to give 
my three guests opportunities of communicating their latest results. Cayley explained his late 
controversy with Boole on a question of Probabilities. Sylvester was eloquent on the subject of 
Reseaux which has now complete possession of him. According to his own confession he is so 
excited about it that he cannot trust his own critical judgment and has to call Smith of Oxford to 
his assistance. Harley entered into a few particulars on his Differential Resolvents of Algebraic 
Equations and I communicated my results on Derived Surfaces which Sylvester pronounced to 
be at once interesting and ‘wonderful’. At 9.30 P.M. we all adjourned (in a cab) to Sabine’s 
Soirée at Burlington House... 


He was now making good progress with his investigations. Shortly afterwards, he 
met Augustus De Morgan and told him of his researches, but received a very cool 
reception. 


15th June 1862: ... He had no better remark to make than ‘How did you come across that 
problem?’ There are such an immense variety of similar questions. It was a kind of pooh pooh in 
fact. I felt angry with myself at having taken him even so much into my confidence. I ought to 
have felt that interest would not be reciprocal. A dry dogmatic pedant I fear is Mr. de Morgan 
notwithstanding his unquestioned ability... 


One of the most important scientific events of 1862 was the meeting of the British 
Association held at Cambridge, where he made several new acquaintances and 
presented a short communication on pedal curves which was ‘listened to with 
attention but created no discussion’. 


4th October 1862: ... I was much pleased with Boole... Immediately after breakfast I stepped 
up to him and introduced myself. The same day we sat together at the Hall dinner and had some 
pleasant chat. Evidently an earnest able and at the same time a genial man. 


He was, however, rather less impressed when he subsequently came across the
distinguished physicist William Thomson, later Lord Kelvin. 


7th June 1863: ... I have attended Thomson’s two lectures at the Royal Institution on the 
Electric Telegraph. More random unsatisfactory lectures I never listened to. 


15th June 1863: ... On Tuesday last I was at an “at home” given by Dr. and Mrs. King, the 
parents of one of my pupils and moreover relations of Prof. W. Thomson of Glasgow. It was the 
first time I had been introduced to Thomson. I cannot say that we suited one another very well 
or exchanged many words. He was civil and spoke flatteringly of my papers. 


During the late summer, his health began to deteriorate again, making it difficult 
for him to transfer his thoughts to his researches, and he paid an extended visit to 
France, Switzerland, Germany, Italy, and Norway. His journal records his sadness 
at the death of Steiner, as well as meetings with both old and new acquaintances. 
While in Germany, he attended a gathering of the Naturforschende Gesellschaft, 
where he met Rudolf Clausius, whose memoirs he had earlier translated. 


1st September 1863: ... I seized Clausius and he introduced me to Dedekind, a modest able 
mathematician Prof. at the Polytechnicum in Braunschweig. After dinner which was enlivened by 
numerous toasts, Clausius, Dedekind and I took our seats in a vehicle for the Excursion to the 
Morteratsch Glacier and a very pleasant excursion it was... 


The following summer, after completing his year’s teaching, he set off on what was 
becoming an annual visit to the Continent. While in Paris, he dined with Chasles: 


16th May 1864: ... The places of honour were given to Tchebichef whose acquaintance I 
renewed, for years ago I met him at Dirichlet’s... On Wednesday Tchebichef called on me and 
left me some of his papers. He is evidently a good natured man, he has a stuttering way of 
speaking French and is lame. 

... On Friday I took the American Railway to Sevres and sought the Maison Penel where 
Bertrand is at present residing. He had invited me to dinner... Tchebichef and myself were 
again the honoured guests on the right and left of Mrs Bertrand, Bertrand himself being 
opposite. I had much more conversation with Bertrand than I ever had before. I remember I had 
once a little prejudice against him. His manner I thought a little pretentious and forbidding. I 
begin to find that this is merely external, the man is kind at heart, extremely clever and full of 
esprit...


He particularly enjoyed spending a month in Bologna, where he renewed his 
acquaintance with Luigi Cremona and attended one of Cremona’s lectures. 


5th June 1864: ... He had a class of about 12 and lectured on the Theory of the Sun’s Dial in 
connection with his Descriptive Geometry. He is evidently a good lecturer; everything was 
explained with perfect clearness. One peculiarity of the lecture arrangements was that instead of 
a black board on the side of the room the top of the table before the professor was of slate and 
on it he wrote and made figures in chalk. The figures were of course inverted to the audience. 


In July 1864, he resigned his post at University College School in order to devote 
more time to research, and in November, he and Tyndall, along with other close 
friends, formed themselves into a select scientific club. 


6th November 1864: ... On Thursday evening Nov. 3, an event, probably of some importance, 
occurred at the St George’s Hotel, Albemarle Street. A new club was formed of eight members: 
viz: Tyndall, Hooker, Huxley, Busk, Frankland, Spencer, Lubbock and myself. Besides personal 
friendship, the bond that united us was devotion to science, pure and free, untrammelled by 
religious dogmas. Amongst ourselves there is perfect outspokenness, and no doubt opportunities 
will arise when concerted action on our part may be of service. The first meeting was very
pleasant and “jolly.” ... There is no knowing into what this club, which counts amongst its
members some of the best workers of the day, may grow, and therefore I record its foundation.
Huxley in his fun christened it the “Blastodermic Club” and it may possibly retain the name.


The “jolly” time they had was obviously fruitful, for the X-club, as it became, was 
to influence the organization and image of English science for the next twenty 
years. They all acquired nicknames such as the Xcentric Tyndall, the Xalted 
Huxley and the Xtravagant Hirst. 


[Figure: Three members of the X-club. The Xcentric Tyndall, the Xalted Huxley, the Xtravagant Hirst.]


In November 1864, Hirst was elected to the Council of the Royal Society for the 
first time. The very next week saw the first meeting of what was later to become 
the London Mathematical Society. Hirst became its first Vice-President, and was a 
member of the Council for almost two decades, becoming its treasurer, and later 
its President. 


13th November 1864: ... On Monday last I attended the first meeting of the Mathematical 
Society at University College. De Morgan gave an address, which I seconded. I was put upon the 
Committee. I had at first declined but at De Morgan’s request allowed my name to stand. 


25th June 1865: ... On Monday I attended the Math. Soc. and proposed Cayley, Sylvester, 
Spottiswoode and Green as members. Sylvester gave us a capital communication on Newton’s 
rule for the discovery of the imaginary roots of an equation. 


On 18th August 1865, he received from the Secretary of University College the
news of his appointment as Professor of Mathematical Physics. He had good
reason to feel pleased with himself. His appointment to this newly-created chair
established him as one of only seven physics professors in the country.


28th August 1865: ... Thus I have reached another step in my career. I have waited long for it 
and sacrificed much in order to stop in London. I trust I may have health and strength to 
perform my new duties efficiently. 


15th October 1865: On Tuesday morning at 9 my work commenced with a lecture to 25 or 26 
students. It passed off well and was listened to with the greatest interest. On Wednesday 
morning I commenced with my senior class, there were 5 students and a visitor... I have since 
continued my work every morning and have now altogether about 32 students which represent 
an income of 162 pounds upon which therefore I shall just be able to live without seeking for
extra work.


This point marks the pinnacle of his career. As a Council member of the Royal 
Society, a member of the X-club, and Vice-President of the London Mathematical 
Society, he had become a most important member of the Scientific Establishment 
in London. Regrettably, his fortunes were soon to decline, as we shall see in the
final article.


Review


Infinitesimal Calculus. By F. S. Carey. 
(Longmans Mathematical Series.) London, 
Longmans, 1919. 8 vo. 20 + 352 + 9 pages. 
Price 14 shillings. 


The symbolism referred to for range
and sequence is simple and worthy of
mention. An open range from a to b is
denoted by brackets [a, b], a closed range
by parentheses (a, b); and a range open
only at one end by the appropriate combination
of the bracket and parenthesis
symbols; thus a range open at a and
closed at b is denoted by [a, b).


—American Mathematical Monthly 
27, (1920) p. 470-471 


From the Post-Markov Theorem
Through Decision Problems 
to Public-Key Cryptography 


Iris Lee Anshel and Michael Anshel 


Dedicated to the Legacy of Emil Post 


1. INTRODUCTION. On November 3—4 1988 a conference to commemorate the 
life and legacy of Emil Post (1897-1954), in anticipation of his one-hundredth 
birthday, was held at the City College of New York. Emil Post graduated from the 
City College in 1917 and received a Ph.D. from Columbia University in 1920. A 
postdoctoral fellowship at Princeton University was followed by long years of 
teaching in the public school system. He returned to the City College in 1935 as a
member of its Mathematics Department where he resided for the remainder of his 
academic career. In the process Post was transformed from a brilliant young 
researcher into a great teacher and visionary intellectual. Four decades after their 
initial contacts with Post his former students spoke of him with a reverence that is 
rarely encountered in university life. The scholarly aspects of this commemorative
meeting dealt with a wide range of Post’s contributions to mathematics, logic and
computer science. In this paper we should like to briefly recount his profound 
influence on the theory of algorithmic decision problems and the connections 
between this active field of research and current methods in public-key cryptogra- 
phy. We conclude our discussion by posing an historical question concerning the 
relationship between Post and the cryptologists of his day, the answer to which 
may shed new light on his legacy in the shadowy world of secret intelligence. 


2. STRING REWRITING, THUE SYSTEMS, AND PRESENTATIONS. In this 
section we briefly review the basic concepts of string rewriting, Thue systems, and 
presentations. With this language in place we will be in a position to discuss some 
of the classical decision problems with which Post was concerned. 

We begin by motivating this discussion with an historical example, a Caesar 
cipher. Identify the letters of the English alphabet $\{A, B, \ldots, Z\}$ with the symbols
$\{a_0, a_1, \ldots, a_{25}\}$. Consider the set of pairs


$$\{(a_i, a_j) \mid i - 1 \equiv j \bmod 26\}.$$


These pairs define a method of encrypting plaintext messages: for example ‘IBM’
($a_8 a_1 a_{12}$) becomes ‘HAL’ ($a_7 a_0 a_{11}$) when $a_i$ appearing in the plaintext string is
replaced by $a_{i-1}$ to obtain the ciphertext.
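
As a minimal illustrative sketch (not part of the original exposition), the replacement rule can be tabulated in Python. The names `ALPHABET`, `RULES` and `encipher` are hypothetical, and the plaintext is assumed to consist of upper-case English letters only.

```python
import string

# A..Z, identified with the symbols a_0, ..., a_25.
ALPHABET = string.ascii_uppercase

# The pairs (a_i, a_j) with i - 1 = j (mod 26), stored as a lookup table.
RULES = {ALPHABET[i]: ALPHABET[(i - 1) % 26] for i in range(26)}


def encipher(plaintext):
    """Replace every a_i in the plaintext by a_{i-1} (indices mod 26)."""
    return "".join(RULES[symbol] for symbol in plaintext)


print(encipher("IBM"))  # HAL
```

Because the index is reduced mod 26, the letter A (that is, $a_0$) is sent to Z.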

The idea behind the above example is to consider an alphabet together with a 
set of replacement or rewriting rules. With this in mind we begin our formal 
development. Let A denote a set of symbols (which we shall refer to as an 
alphabet). Consider FM(A) the free monoid based on A. The elements of FM(A) 
are the finite sequences of symbols or words from the alphabet A. Equality in 
FM(A) will be denoted by =; that is, given words u and v in FM(A), u = v if
and only if they denote exactly the same string. Multiplication in FM(A) of the
words u, v is simply given by the concatenation uv, and the empty word e serves as
the identity in FM(A).

A rewriting system RW on A consists of the pair (A, P) where 


$$P \subseteq FM(A) \times FM(A).$$


A derivation with respect to RW is a finite sequence of words in FM(A),


$$w_1, \ldots, w_n,$$

such that either $n = 1$ or for each $i = 1, \ldots, n - 1$ there exist
$x_i, y_i \in FM(A)$ and $(u_i, v_i) \in P$ such that the equations

$$w_i = x_i u_i y_i \quad \text{and} \quad w_{i+1} = x_i v_i y_i$$

hold. We shall refer to the pair $(u_i, v_i)$ as a rewriting rule or production, and term
$w_n$ derivable from $w_1$.
Returning to the Caesar cipher rewriting system described above, the process of
enciphering ‘IBM’ by ‘HAL’ is given by the following derivation:

$$a_8 a_1 a_{12}, \quad a_7 a_1 a_{12}, \quad a_7 a_0 a_{12}, \quad a_7 a_0 a_{11}.$$
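
The definition of a derivation can be checked mechanically. The following is a minimal sketch under the assumption that words are modelled as Python strings and productions as pairs of strings; the helper names `is_step` and `is_derivation` and the table `CAESAR` are hypothetical.

```python
def is_step(w, w_next, rules):
    """True if w = x u y and w_next = x v y for some production (u, v)."""
    for u, v in rules:
        start = w.find(u)
        while start != -1:
            # Replace this occurrence of u by v and compare with w_next.
            if w[:start] + v + w[start + len(u):] == w_next:
                return True
            start = w.find(u, start + 1)
    return False


def is_derivation(words, rules):
    """True if each word arises from its predecessor by one production."""
    return len(words) >= 1 and all(
        is_step(a, b, rules) for a, b in zip(words, words[1:])
    )


# The Caesar productions (a_i, a_{i-1}), written with the letters A..Z.
CAESAR = [(chr(65 + i), chr(65 + (i - 1) % 26)) for i in range(26)]

print(is_derivation(["IBM", "HBM", "HAM", "HAL"], CAESAR))  # True
```

A sequence consisting of a single word is accepted, matching the case $n = 1$ of the definition, while the pair ‘IBM’, ‘HAL’ on its own is rejected because it changes three symbols in one step.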


In his 1947 paper on the algorithmic unsolvability of a problem of Thue, Post 
introduced a class of combinatorial systems which he called systems of semi-Thue 
type [21]. From our perspective these are rewriting systems with finite alphabet and 
finitely many rewrite rules, together with a specified initial word. The semi-Thue 
systems serve as a convenient tool to represent the computation of a Turing 
machine. An exposition of this methodology is given in a now classic text on 
computability and unsolvability by Martin Davis [4], a student of Post and an 
authority on his life and work. 

A Thue system T is a rewriting system such that the set P of productions is a 
symmetric relation on FM(A): if (u, v) ∈ P then (v, u) ∈ P. The imposition of this
symmetry condition ensures that the process of deriving (or rewriting) the word $w_n$
from $w_1$ as above is reversible. Fixing our Thue system we now consider the
equivalence relation P* on FM(A) generated by the set of productions P (by
definition P* is the intersection of those equivalence relations on FM(A) which
contain P). From the definition of P*, it follows that two words in FM(A) are
equivalent provided one is derivable from the other. Moreover P* is a congruence
on FM(A). The semi-group M(T) specified by the Thue system T is thus
isomorphic to the factor monoid 


M(T) = FM(A)/P*. 


Specifying each production (u,v) and its reverse by the equation u = v we obtain 
the traditional monoid presentation for M(T), 


$$\langle A;\ u = v\ ((u, v) \in P) \rangle. \qquad (2.1)$$


Conversely, it is not difficult to show that given an arbitrary monoid M there exists 
some Thue system T such that


M = M(T).


In the case A is finite we say that T and M(T) are finitely generated, in the case P
is finite we say that T and M(T) are finitely related, and the Thue system T and
the monoid M(T) are said to be finitely presented if they are both finitely
generated and finitely related. We will be concerned with finitely presented Thue
systems T and we shall denote the associated monoid presentation by


$$\langle a_1, \ldots, a_n;\ u_1 = v_1, \ldots, u_m = v_m \rangle. \qquad (2.2)$$


A group alphabet is an alphabet A partitioned into two disjoint subsets, A(+),
the positive symbols, and A(−), the inverse symbols, together with an involutive
permutation inv of A (that is, inv applied twice is the identity) such that


$$\mathrm{inv} \colon A(+) \to A(-).$$


We write $a^{-1} = \mathrm{inv}(a)$ for $a$ in A and note $(a^{-1})^{-1} = a$. The notation extends to
the words over the group alphabet by taking $e^{-1} = e$, and for $z = b_1 \cdots b_k$ setting
$z^{-1} = b_k^{-1} \cdots b_1^{-1}$ where the $b_i$ are positive or inverse symbols. A word $z$ is said to
be freely reduced provided it is not of the form $z = x a a^{-1} y$ or $x a^{-1} a y$ for any $a$ in
A(+). A monoid presentation is said to be a group presentation provided it is
specified by a Thue system over a group alphabet whose rewrite rules satisfy the
following conditions:

(i) All rewriting rules of the form $a a^{-1} = e$ and $a^{-1} a = e$ for $a$ in A are contained
in the system, and we call these rules trivial relators.

(ii) All other rewriting rules or non-trivial relators occur in pairs of the form $u = e$,
$u^{-1} = e$ where $u, u^{-1}$ are freely reduced words.

The collection of all rewriting rules is called the defining relators. 
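
Free reduction and formal inversion of words are easy to carry out by machine. In the minimal sketch below a positive symbol is assumed to be a lower-case ASCII letter and its inverse the corresponding upper-case letter; this encoding and the names `inverse` and `free_reduce` are illustrative assumptions, not notation from the text.

```python
def inverse(word):
    """Formal inverse: reverse the word and invert every symbol."""
    return word[::-1].swapcase()


def free_reduce(word):
    """Cancel adjacent pairs a a^{-1} and a^{-1} a until none remain."""
    stack = []
    for symbol in word:
        if stack and stack[-1] == symbol.swapcase():
            stack.pop()  # the new symbol cancels the previous one
        else:
            stack.append(symbol)
    return "".join(stack)


print(free_reduce("abBA"))                 # prints an empty line
print(free_reduce("ab" + inverse("ab")))   # prints an empty line
print(free_reduce("aAb"))                  # b
```

The stack makes a single left-to-right pass, so the running time is linear in the length of the word.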

In general, the trivial defining relators are suppressed when specifying the 
group, as are the companions to each non-trivial relator u = e. In addition we list 
only the positive symbols. A finitely presented group G is thus specified and 
denoted by 


$$\langle a_1, \ldots, a_n;\ u_1 = e, \ldots, u_m = e \rangle.$$


For an example of a presentation we consider $F_n$, the free group of rank $n$, which
by definition has no non-trivial relators. We see that $F_n$ is specified by the group
presentation


$$\langle a_1, \ldots, a_n \rangle.$$


Every group is a factor group of a free group and may be specified up to 
isomorphism by a group presentation. 


3. ALGORITHMIC DECISION PROBLEMS. By a decision problem we mean one 
whose instances require a yes/no answer. A decision problem is said to have an 
algorithmic solution if it is possible to program a digital computer to correctly 
supply the yes/no responses. If this is not possible we say that the problem is 
algorithmically unsolvable. If there is such a program and the running time of the 
program is bounded by a polynomial in the symbolic size of the input then we say 
the solution is efficiently constructible. 

The algorithmic concepts referred to above are very natural to our computer- 
ized society. This was not the case in 1936 when Post [19] as well as Turing [24] 
both formulated a basis for these concepts by specifying idealized computing 
machines. In Turing’s exposition a universal computer capable of executing any 
algorithm is constructed. The halting (or stopping) problem for this machine is
then shown to be algorithmically unsolvable. Post [21] clarified the construction of 
Turing’s machine and applied his methods in order to resolve a problem of Thue 
[23] which we will discuss in §4. 

Perhaps the most widely discussed decision problem of the twentieth century is 
Hilbert’s Tenth Problem. This problem asks for an algorithmic solution for 
determining whether or not an integral polynomial equation has integer solutions. 
Post believed that Hilbert’s Tenth Problem was algorithmically unsolvable [20]. 
The force of this belief was conveyed to Martin Davis who, along with Hilary 
Putnam and Julia Robinson, provided the basis for Yuri Matiyasevich’s proof of its 
algorithmic unsolvability [14]. Martin Davis was cited by Marvin Minsky at his 
address to the City College conference as directing his (Minsky’s) attention to 
Post’s problem of tag. Minsky went on to show that this problem was algorithmi- 
cally unsolvable [17]. Researchers such as Davis and Minsky have enabled Post’s 
ideas to be transferred and transformed by succeeding generations. This process 
has made possible the enormous advances in computer science that have allowed it 
to emerge as an academic discipline. 


4. THE POST-MARKOV THEOREM AND SUBSEQUENT DEVELOPMENTS. 
We now restrict our attention to finitely presented monoids and groups. In the 
following discussion, we fix a presentation for the monoid or group in question. 
The word problem for a finitely presented monoid (resp. group) is to decide for 
arbitrary words, w, z in the alphabet (resp. group alphabet) whether or not the 
words are congruent via the associated Thue system (resp. group presentation). 

Thue’s word problem for finitely presented monoids appears in a 1914 paper of 
A. Thue [23] while the word problem for groups is formulated in the course of a 
topological investigation in 1911 by M. Dehn (see [3]). Another problem formulated
by Dehn (see [3]) was the conjugacy problem: given arbitrary words w, z from
a finite presentation of a group, decide if there is a word x such that w and $x^{-1} z x$
are congruent via the presentation (and thus define conjugate elements in the
associated group). It is not difficult to prove that the algorithmic solvability of both 
the word and conjugacy problems is independent of the fixed finite presentation. 

The negative resolution of the above problems represents an important achieve- 
ment of twentieth century mathematics and one in which both Post and A. A. 
Markov played a fundamental role. In 1947 Post [21] and slightly later Markov [13] 
published independent proofs of the algorithmic unsolvability of the word problem 
for finitely presented Thue systems. The version of this result stated below reflects 
contemporary concern with constructive computational methods. 


Post-Markov Theorem. There exist finitely presented Thue systems having algorithmically
unsolvable word problem. Moreover, there is an efficient algorithm P which,
upon input of any Turing machine $\mathcal{T}$ (resp. normal algorithm $\mathcal{A}$) will output a
finitely presented Thue system $P(\mathcal{T})$ (resp. $P(\mathcal{A})$), such that $P(\mathcal{T})$ (resp. $P(\mathcal{A})$)
has an algorithmically unsolvable word problem if $\mathcal{T}$ (resp. $\mathcal{A}$) has an algorithmically
unsolvable halting problem.


A brief survey of finitely presented Thue systems with algorithmically unsolv- 
able word problem is given by Matiyasevich [14] together with a proof of the 
Post-Markov Theorem employing normal algorithms. The rewriting techniques 
developed by Post to represent the computation of a Turing machine find their 


838 POST-MARKOV THEOREM THROUGH DECISION PROBLEMS [November 


way into several textbook proofs of the Novikov-Boone Theorem for the word 
problem in group theory. 


Novikov-Boone Theorem. There exist finitely presented groups having algorithmically
unsolvable word problem. Moreover, there is an efficient algorithm B which
upon input of any finitely presented Thue system T will output a finitely presented
group B(T) such that B(T) has algorithmically unsolvable word problem if T has
algorithmically unsolvable word problem.


Examples of presentations of groups with algorithmically unsolvable word 
problem may be obtained using techniques studied by J. L. Britton (see [3]), but at 
the present time these presentations are quite complicated and involve many 
defining relators. The situation for finitely presented semigroups is however quite 
different. The semigroup 


$$S = \langle a, b, c, d, e \mid ac = ca,\ ad = da,\ bc = cb,\ bd = db,\ eca = ce,\ edb = de,\ c^2 a = c^2 a e \rangle$$


while seeming simple in form has been shown by G. C. Tzeitlin to have an 
algorithmically unsolvable word problem (see Lallement [11] for an accessible 
proof). It is such striking examples that demonstrate the subtlety of the word 
problem. 
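
Unsolvability does not prevent one from searching for derivations: a bounded breadth-first search can confirm that two words are congruent when a short derivation exists, but a failed search proves nothing. The sketch below is only an illustration under stated assumptions. Words are Python strings, the names `RELATIONS`, `neighbours` and `search` are hypothetical, and the last relation is taken to read $c^2 a = c^2 a e$.

```python
from collections import deque

# Defining relations of S as pairs (u, v); each is applied in both directions.
RELATIONS = [("ac", "ca"), ("ad", "da"), ("bc", "cb"), ("bd", "db"),
             ("eca", "ce"), ("edb", "de"), ("cca", "ccae")]
RULES = RELATIONS + [(v, u) for u, v in RELATIONS]


def neighbours(word):
    """Yield every word obtained from `word` by one rewriting step."""
    for u, v in RULES:
        start = word.find(u)
        while start != -1:
            yield word[:start] + v + word[start + len(u):]
            start = word.find(u, start + 1)


def search(source, target, max_steps):
    """Return a derivation of at most max_steps steps, or None if none is found."""
    parent = {source: None}
    frontier = deque([(source, 0)])
    while frontier:
        word, depth = frontier.popleft()
        if word == target:
            path = []
            while word is not None:
                path.append(word)
                word = parent[word]
            return path[::-1]
        if depth < max_steps:
            for nxt in neighbours(word):
                if nxt not in parent:
                    parent[nxt] = word
                    frontier.append((nxt, depth + 1))
    return None


print(search("eac", "ce", 2))  # ['eac', 'eca', 'ce']
```

A result of `None` only says that no derivation of the given length exists; raising `max_steps` never yields a proof that two words are inequivalent, which is where the difficulty of the word problem lies.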

The authors were surprised to discover a relationship between Thue’s word 
problem and Dehn’s conjugacy problem. A finitely presented commutative Thue 
system is one whose rewriting rules include (ab, ba) for all distinct a,b in its 
alphabet. Its associated semigroup is commutative and its word problem is algo- 
rithmically solvable. In M. Anshel [2] these properties are explicitly employed to 
show that the conjugacy problem for a special class of finitely presented groups is 
algorithmically solvable. 

Results of a positive nature can be obtained for many large classes of groups 
and to give a perspective we highlight a few. A group G is termed residually finite
provided that, given g ∈ G, g ≠ 1, there is a normal subgroup $N_g$ of finite index in G such that
g is not contained in $N_g$. Equivalently a group is residually finite if the intersection
of the subgroups of finite index is the identity. The word problem for finitely
presented residually finite groups is algorithmically solvable since both the words 
defining the identity element in G and the words defining the nonidentity 
elements in G are recursively enumerable (see [3]). 

An important class of groups for which the word problem can be decided is the 
class of finitely generated groups with a single defining relator (i.e. one relator
groups). Dehn originally formulated the word problem in the course of his 
investigation of the fundamental groups of orientable two dimensional manifolds 
(which are one relator groups). He did solve the word problem for these groups 
with the algorithm that has come to be known as Dehn’s algorithm. Dehn’s 
algorithm is studied geometrically in the context of small cancellation groups and 
more recently has surfaced in the study of hyperbolic groups initiated by Gromov 
(see [3]). The complexity of Dehn’s algorithm is studied by B. Domanski and 
M. Anshel in [6] where it is shown that a finitely presented group of Dehn’s 
algorithm has word problem solvable in linear time on a deterministic multitape 
Turing machine. W. Magnus (a student of Dehn) studied the entire class of one 
relator groups and proved through entirely algebraic means one of the landmark 
theorems in combinatorial group theory: the word problem for finitely generated 
one relator groups is algorithmically solvable (see [3]). More recently I. Anshel [1] 
has investigated a class of groups with two relators and at least three generators
and again the word problem is seen to be algorithmically solvable. The analysis 
here is close in spirit to that Magnus employed with the addition of methods from 
the theory of groups acting on graphs (see [3] for an introduction to these 
methods). To get some idea of the phenomenon that can arise when looking at this 
problem the reader is invited to consider the group E given by the presentation 


$$\langle a, b \mid a^{-1} b^2 a = b^3,\ b^{-1} a^2 b = a^3 \rangle$$


and show every word in the generators defines the identity element (i.e. the group 
is trivial). 

Recall that the conjugacy problem requires an algorithm to decide, given two 
elements in a group, whether or not they are conjugate. A striking result is proved 
by Charles F. Miller III in [16] regarding the conjugacy problem and is very much 
in the spirit of Post and of Miller’s mentor W. W. Boone. 


Miller’s Theorem. There exist finitely presented residually finite groups with algorithmically
unsolvable conjugacy problem. Moreover, there is an efficient algorithm C
which upon input of a finitely presented group G will output a finitely presented
residually finite group C(G), such that C(G) has algorithmically unsolvable conjugacy
problem if G has algorithmically unsolvable word problem.


5. PUBLIC-KEY CRYPTOSYSTEMS BASED ON THE WORD AND CONJU- 
GACY PROBLEMS. In conventional cryptography a method is provided to a 
sender S and a receiver R to transmit messages over an insecure channel by a 
mechanism such as a code book which provides easy encoding and decoding 
facilities. A particular weakness of such cryptosystems is that an interceptor with 
knowledge of the encoding facilities can readily decode transmitted messages. 
Conventional cryptography underwent automation at the end of World War I 
resulting in the development of mechanical cipher machines. These machines 
required the possession by both the sender S and the receiver R of a single key k 
in order to encode and decode a transmitted message. Cryptoanalytic machine 
attacks (i.e. code-breaking) on one class of cipher machines, the Enigma, provided 
critical information to the Allies during World War II (see [8]-[10]). 

One response to the advances in codebreaking technology was the introduction 
of public-key cryptosystems. This allows an R to receive messages from many
senders $S_1, S_2, \ldots, S_n$ without the introduction of numerous codebooks or keys.
These are replaced by a public-key mechanism which enables any sender $S_i$ to
easily encode messages which may be then transmitted over insecure channels. The
code is designed so that if some third party T intercepts a message, T will find it
computationally infeasible to break the code even with knowledge of the encoding
mechanism employed by $S_i$ unless the receiver’s private key is known to T.

One widely discussed system is the RSA public-key cryptosystem named after its 
inventors R. L. Rivest, A. Shamir and L. Adleman (see [22]). Its security is
generally thought to depend on the intractability of factoring large integers. One 
weakness, common to all public-key cryptosystems is that once the system is 
specified cryptoanalytic attacks may be initiated. Two such attacks on the RSA 
system are outlined in [15]. 
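
For orientation, the arithmetic behind RSA can be shown with deliberately tiny numbers. This is a toy sketch only: the primes 61 and 53, the exponent 17 and the function names are illustrative assumptions, and parameters of this size offer no security, since the modulus can be factored at sight.

```python
# Toy parameters; real keys use primes hundreds of digits long.
p, q = 61, 53
n = p * q                  # public modulus
phi = (p - 1) * (q - 1)    # computable only by someone who can factor n
e = 17                     # public exponent, coprime to phi
d = pow(e, -1, phi)        # private exponent (modular inverse; Python 3.8+)


def encrypt(message):
    """Encode an integer 0 <= message < n with the public key (n, e)."""
    return pow(message, e, n)


def decrypt(ciphertext):
    """Decode with the private exponent d."""
    return pow(ciphertext, d, n)


print(decrypt(encrypt(65)))  # 65
```

Anyone who factors n recovers phi and hence d, which is the sense in which the security of the system is thought to rest on the difficulty of factorization.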

Another more ambitious public-key cryptosystem based on the algorithmic
unsolvability of the word problem was suggested by N. R. Wagner and M. R.
Magyarik (see [25]). Begin with a finitely presented group G specified by


$$\langle a_1, \ldots, a_n \mid u_1 = 1,\ u_2 = 1,\ \ldots,\ u_m = 1 \rangle$$


with algorithmically unsolvable word problem, together with a secret homomorphism


$$h \colon G \to A$$


to a finitely presented group A with efficiently solvable word problem (such as a 
large finite group or finitely presented group of Dehn’s algorithm). The homomor- 
phism h is specified by its values on the finite generating set of G. Employing a 
consequence of Von Dyck’s theorem [3], one can verify that h is a homomorphism
by demonstrating that the image of each defining relator of G is the identity in the 
group A (note that this can be verified since A has efficiently solvable word 
problem). For this scheme we require two elements of G given respectively by
words $y_0, y_1$ such that


$$h(y_0) \neq h(y_1).$$


The mechanism for encoding is simply to replace each transmission of a ‘0’ bit by
any word $y_0'$ where


$$y_0' = y_0 \bmod G$$


and similarly a ‘1’ bit is replaced by any word $y_1'$ such that $y_1' = y_1 \bmod G$. Thus
for example the sequence 0, 1, 1, 0 becomes

$$y_0',\ y_1',\ y_1',\ y_0'$$

where $y_i'$ is obtained from $y_i$ by successively inserting and/or deleting the
defining relators of G. In this scheme the group G and the words $y_0, y_1$ constitute
the public-key, while the homomorphism h constitutes the private-key. An interceptor
T is faced with solving the word problem for G since R need never reveal
the homomorphism h: G → A. Although this system, like any other may be
attacked, it is not based on such a fragile mechanism as the intractability of 
factorization within the current technology. 
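
The mechanics of the scheme can be traced on a toy instance. The sketch below is illustrative only and rests on several assumptions that are not in the text: the public group is the small presentation ⟨a, b | a² = 1, b³ = 1, (ab)² = 1⟩, which has an easily solvable word problem and therefore gives no security; the private homomorphism maps into the symmetric group on three points; lower-case letters stand for generators and upper-case letters for their inverses; and the encoder only inserts relators, never deletes them. All function names are hypothetical.

```python
import random

# Toy public key: defining relators and the two words standing for the bits.
RELATORS = ["aa", "bbb", "abab"]
Y0, Y1 = "a", "b"

# Toy private key: images of the generators as permutations of {0, 1, 2},
# written as tuples p with p[i] the image of the point i.
H = {"a": (1, 0, 2), "b": (1, 2, 0)}
IDENTITY = (0, 1, 2)


def inverse_word(word):
    return word[::-1].swapcase()


def compose(p, q):
    """Apply p first, then q."""
    return tuple(q[p[i]] for i in range(len(p)))


def invert(p):
    result = [0] * len(p)
    for i, j in enumerate(p):
        result[j] = i
    return tuple(result)


def image(word):
    """Value of the private homomorphism h on a word."""
    result = IDENTITY
    for s in word:
        g = H[s] if s.islower() else invert(H[s.lower()])
        result = compose(result, g)
    return result


def scramble(word, steps, rng):
    """Insert relators, their inverses, or trivial relators at random places."""
    for _ in range(steps):
        x = rng.choice("ab")
        pieces = RELATORS + [inverse_word(r) for r in RELATORS]
        pieces += [x + x.swapcase(), x.swapcase() + x]
        piece = rng.choice(pieces)
        k = rng.randrange(len(word) + 1)
        word = word[:k] + piece + word[k:]
    return word


def encode(bits, rng, steps=8):
    return [scramble(Y1 if bit else Y0, steps, rng) for bit in bits]


def decode(words):
    return [0 if image(w) == image(Y0) else 1 for w in words]


if __name__ == "__main__":
    rng = random.Random(0)
    ciphertext = encode([0, 1, 1, 0], rng)
    print(decode(ciphertext))  # [0, 1, 1, 0]
```

Every inserted piece represents the identity of G, so the image under h is unchanged and the receiver recovers each bit by a single evaluation in the finite group. An interceptor without h must instead decide which of the two public words the received word is equal to in G.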

As a homage to Post we propose a public-key cryptosystem based on Miller’s 
Theorem for constructing finitely presented residually finite groups with algorith- 
mically unsolvable conjugacy problem (this extends the work of N. R. Wagner and 
M. R. Magyarik). The proposed cryptosystem begins with a finitely presented 
residually finite group G specified by, 


$$\langle a_1, \ldots, a_n;\ u_1 = e, \ldots, u_m = e \rangle \qquad (5.1)$$


with algorithmically unsolvable conjugacy problem (such a group’s existence is 
insured by Miller’s theorem, see §4). The additional data required for this system 
are two elements of G, {w, z}, such that


$$w \neq 1 \quad \text{and} \quad z = 1$$


in G. Since G was chosen to be residually finite there exists a finite image of
G, $G/N_w$, such that


$$w \notin N_w.$$

Thus when we consider the homomorphism

$$h \colon G \to G/N_w$$

we may assert that

$$h(w) \neq 1, \qquad h(z) = 1.$$


Hence we conclude that w, z and h(w), h(z) are non-conjugate pairs in G and
$G/N_w$, respectively. We keep the homomorphism h secret and assume that
computation in the finite group is efficient enough to determine when two
elements specified by words are conjugate or whether a word defines the identity
in $G/N_w$. The mechanism for recoding now follows that of the word problem
cryptosystem described above with the enhancement of conjugation by words in G 
(as well as insertions and deletions of defining relators) being allowed. We observe 
that the complexity of the word problem for finitely presented residually finite 
groups is unknown at the time of this writing. In the Kourovka Notebook ([12],
p. 58), F. B. Cannonito asks: 


Do there exist finitely presented residually finite groups with recursive, but not 
primitive recursive, solution of the word problem? 


As with the RSA cryptosystem, the conjugacy problem cryptosystem is based on 
the computational complexity of a special problem. To date, the research on 
integer factorization is massive [18] as compared to the research on the word 
problem for residually finite groups. In fact very little is known regarding the 
computational complexity of the word problem for these groups as the above
recursion-theoretic problem indicates. 


6. POST’S RELATION TO THE CRYPTOLOGY AND CRYPTOLOGISTS OF 
HIS ERA. We conclude our discussion by posing the following historical question: 


What impact did Post have on the cryptologists of his era? 


This question arises from two distinct sources. The first source is the very strong 
connection between the development of both the theory and practice of digital 
computation and cryptology and the second concerns Post’s contemporaries at the 
City College of New York, an institution where Post spent nearly his entire adult 
life. 

The intertwining of computation and cryptology is quite explicit in the lives of 
two individuals, Charles Babbage (1791-1871) and Alan M. Turing (1912-1954). 
Babbage was a prominent British mathematician whose Difference Engines and 
Analytic Engines were forerunners of the modern digital computer. He was also 
prominent among the cryptologists of his era for successful cryptoanalytic attacks 
on polyalphabetic ciphers (see [7]). Turing, a British contemporary of Post, played 
an instrumental role in the Allied victory in World War II by employing computing 
machines to break the Enigma code. Turing’s life and work are documented in [9]. 
Post was certainly aware of, and indeed employed, Turing’s work with regard to the
algorithmic unsolvability of the halting problem. It is only natural to ask whether 
there was a reciprocal interest in Post’s work on the part of Turing (from the 
perspective of cryptology). 

It is pointed out in [7] that Charles J. Mendelsohn, a contemporary of Post and a
faculty member of the History Department at City College, was very much
involved in cryptological pursuits. In 1918 Mendelsohn was made a Captain in the
Military Intelligence Division of the General Staff of the U.S. Army in charge of
decipherment of German codes. Mendelsohn was a classics scholar as well as a
historian and he pursued a lifelong study of historical ciphers and their originators,
including a study of Vigenère which appeared in 1940, the year following his death.
In fact the proofs of this paper were corrected by his associate and friend, Lt. Col.
William F. Friedman, the Principal Cryptoanalyst in the Office of the Chief Signals
Officer of the U.S. Army. It was the same Friedman who rebuilt the U.S.
cryptoanalytic capability during the 1930’s by hiring for the Signals Intelligence
Service Abraham Sinkov and Solomon Kullback (both City College graduates and
both to go on to doctorates in mathematics and productive cryptological research
for the National Security Agency). Friedman is regarded as one of America’s top
codebreakers in that his work led to the Japanese defeat at Midway during WWII
(see [10]).

After discussions with the historian of cryptology David Kahn, and Harold 
Highland, editor emeritus of the journal Computers and Security (who attended 
City College during the nineteen thirties) there is a clear sense that the informal 
discussion groups which took place during that period would have lent themselves 
to consideration of Post’s work. Further indications of such contact were evident 
when, in the course of his address to the November 1988 conference at City 
College, Marvin Minsky observed that Post rewriting methodology had been 
employed during the nineteen sixties on a cryptographic project. 

The shadowy world of secret intelligence has provided scant information for an 
historical investigation of these matters. Even such a distinguished mathematician 
as Peter Hilton reports in [8]: 


“I am unfortunately obliged to be reticent about the details of the work we
did at Bletchley Park in breaking the highgrade German cipher (sic during 
WWII). For reasons best known—indeed, almost exclusively known—to 
themselves, the bureaucrats in Washington and Whitehall steadfastly refuse 
to declassify such details.” 


Steven Brams, the noted game theorist and political scientist, has remarked to us 
that the life and legacy of Emil Post represents one aspect of New York intellec- 
tual life during the first half of the twentieth century that is very much in need of 
deeper exploration. The authors hope that this paper serves to further this pursuit. 

The authors would like to thank Martin Davis for supplying us with a prelimi- 
nary manuscript [5] of his biographical and scholarly survey of Post’s life and 
achievements. 
Dropped Science and took to Divinity.
—American Mathematical Monthly 
28, (1921) p. 394 
Famous Nonmathematicians


Steven G. Buyske* 


We often tell our students that there are many things besides teaching and
actuarial work that they can do with a degree in mathematics, but I don’t think 
they believe us. Over the years I’ve put together a list of well-known people who 
were math majors (or some equivalent in other countries and times), although not 
all of them completed their degrees. It’s the most popular thing I’ve ever had on 
my office door. When I began this list, it had mostly contemporary Americans, and 
I called it “People who majored in math.” Some of my students added their own 
names to their copies and posted them on their dorm doors. 
I’d be delighted to hear of any additional names. 


THE PUBLIC REALM. 

Ralph Abernathy, civil rights leader and Martin Luther King’s closest aide. 

Corazon Aquino, former President of the Philippines. She was a math minor. 

Harry Blackmun, Associate Justice of the US Supreme Court, AB summa cum 
laude in mathematics at Harvard. 

David Dinkins, Mayor of New York, BA in mathematics from Howard. 

Alberto Fujimori, President of Peru, MS in mathematics from the University of 
Wisconsin-Milwaukee. 

Ira Glasser, Executive Director of the American Civil Liberties Union, both a BS 
and an MA. 

Lee Hsien Loong, Deputy Prime Minister of Singapore, a Bachelor’s from Cam- 
bridge. 

Florence Nightingale, pioneer in professional nursing. She was the first person in 
the English-speaking world to apply statistics to public health. She was also a 
pioneer in the graphic representation of statistics; the pie-chart was her inven- 
tion, for example. Not really a math major, she was privately educated, but 
pursued mathematics far beyond contemporary standards for women. 

Paul Painlevé, President of France in the early 20th century, and one of the first 
passengers of the Wright Brothers. A ringer: he had a distinguished mathemati- 
cal career. 

Carl T. Rowan, columnist for the Washington Post. 

Laurence H. Tribe, Professor at Harvard Law School, often regarded as one of the 
great contemporary authorities on Constitutional Law. An AB summa cum 
laude in mathematics from Harvard. 

Leon Trotsky, revolutionary. He began to study Pure mathematics at Odessa in 
1897, but imprisonment and exile in Siberia seem to have ended his mathemati- 
cal efforts. 


*I’d like to thank my colleagues and the many people on USENET who have given me names and
leads. 
Eamon de Valera, long-time Prime Minister and then President of the Republic of
Ireland. A ringer: he was a mathematics professor before Irish independence. 


MUSIC. 

Ernst Ansermet, founder and conductor of the Orchestre de la Suisse Romande. 

Pierre Boulez, Modernist composer and conductor. 

Clifford Brown, Fifties jazz trumpeter. 

Art Garfunkel, folk-rock singer. MA in mathematics from Columbia in 1967. 
Worked on a PhD at Columbia, but chose to pursue his musical career instead. 

Philip Glass, composer, a Bachelor’s from the University of Chicago.

Carole King, Sixties songwriter, and later a singer-songwriter. She dropped out 
after one year of college to pursue her music career. 

Tom Lehrer, songwriter-parodist. PhD student in mathematics at Harvard. 

Lawrence Leighton Smith, conductor and pianist. 


THE OTHER ARTS. 

Lewis Carroll, author of Alice in Wonderland, Through the Looking Glass, and 
other works. A ringer: he was a logician under his real name, Charles Lutwidge 
Dodgson. 

Heloise (Poncé Cruse Evans), of Hints from Heloise. She minored in math. 

Larry Niven, science fiction writer, winner of the Nebula and Hugo awards. 

Omar Khayyam, author of The Rubaiyat. Another ringer: he published works on 
algebra and Euclid. 

Alexander Solzhenitsyn, Nobel prize-winning novelist, a degree in mathematics 
and physics from the University of Rostov. 

Bram Stoker, author of Dracula, took honors at Trinity College, Dublin.

Christopher Wren, the architect of St. Paul’s Cathedral in London. 


FINANCE. 

John Maynard Keynes, the great economist. MA and 12th Wrangler, Cambridge 
University. 

J. Pierpont Morgan, the banking, steel, and railroad magnate. Some of the 
Göttingen faculty tried to convince him to become a professional mathematician.

Ed Thorp, one of the inventors of program-trading on Wall Street.


PHILOSOPHERS. 

Edmund Husserl, the “Father of Phenomenology,” PhD 1883 from Vienna. 

Ludwig Wittgenstein, one of the giants of twentieth-century philosophy. Studied 
mathematical logic with Bertrand Russell. 


ATHLETES AND OTHER COMPETITORS. 

Michael Jordan, basketball superstar. He changed to another major in his junior 
year. 

Davey Johnson, manager of the 1986 New York Mets. 

Emanuel Lasker, world chess champion from 1894-1921. Another ringer, he was a 
mathematics professor with several published papers. 

David Robinson, basketball star. BS in mathematics from Annapolis. 

Frank Ryan, star quarterback for the Cleveland Browns in the sixties. PhD from 
Rice. 

Virginia Wade, Wimbledon champion, BS in mathematics and physics from Sussex. 



LITERARY CRIMINALS. 

James Moriarty, former Professor of Mathematics, author of The Dynamics of an
Asteroid, whose essay on the binomial theorem is said to have had a continental 
vogue, became the leader of the most sinister criminal conspiracy in Victorian 
England. He has been called “the Napoleon of Crime.” Sherlock Holmes’s 
nemesis. 


Mathematics Department 
Lafayette College 

Easton, PA 18042 
buyskes@lafvax.lafayette.edu 


PICTURE PUZZLE 
(from the collection of Paul Halmos) 
The smile came in 1984, soon after his great victory.
(see page 883.) 
The Fundamental Theorem of Linear
Algebra 


Gilbert Strang 


This paper is about a theorem and the pictures that go with it. The theorem 
describes the action of an m by n matrix. The matrix A produces a linear 
transformation from $\mathbf{R}^n$ to $\mathbf{R}^m$, but this picture by itself is too large. The “truth”
about Ax = b is expressed in terms of four subspaces (two of $\mathbf{R}^n$ and two of $\mathbf{R}^m$).
The pictures aim to illustrate the action of A on those subspaces, in a way that 
students won’t forget. 

The first step is to see Ax as a combination of the columns of A. Until then the 
multiplication Ax is just numbers. This step raises the viewpoint to subspaces. We 
see Ax in the column space. Solving Ax = b means finding all combinations of the 
columns that produce b in the column space: 


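$$Ax = \begin{bmatrix} \text{columns of } A \end{bmatrix} \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} = x_1(\text{column } 1) + \cdots + x_n(\text{column } n) = b.$$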


The column space is the range R(A), a subspace of $\mathbf{R}^m$. This abstraction, from
entries in A or x or b to the picture based on subspaces, is absolutely essential. 
Note how subspaces enter for a purpose. We could invent vector spaces and 
construct bases at random. That misses the purpose. Virtually all algorithms and 
all applications of linear algebra are understood by moving to subspaces. 
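
The column picture can be checked on a small case. The sketch below is an illustration only: the helper names and the vector x are assumptions, and the matrix is the 2 by 2 example that appears later in this paper. It computes Ax once by rows (dot products) and once as a combination of the columns.

```python
def matvec(A, x):
    """Row picture: each entry of Ax is the dot product of a row of A with x."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def column_combination(A, x):
    """Column picture: Ax = x_1 (column 1) + ... + x_n (column n)."""
    m, n = len(A), len(A[0])
    result = [0] * m
    for j in range(n):
        column_j = [A[i][j] for i in range(m)]
        result = [r + x[j] * c for r, c in zip(result, column_j)]
    return result

A = [[1, 2], [3, 6]]   # example matrix
x = [4, -1]            # arbitrary illustrative vector

print(matvec(A, x))               # [2, 6]
print(column_combination(A, x))   # [2, 6]
```

Both routes give the same vector, and the second makes it visible that the result lies in the column space.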

The key algorithm is elimination. Multiples of rows are subtracted from other 
rows (and rows are exchanged). There is no change in the row space. This subspace 
contains all combinations of the rows of A, which are the columns of $A^T$. The row
space of A is the column space $R(A^T)$.

The other subspace of $\mathbf{R}^n$ is the nullspace N(A). It contains all solutions to
Ax = 0. Those solutions are not changed by elimination, whose purpose is to
compute them. A by-product of elimination is to display the dimensions of these 
subspaces, which is the first part of the theorem. 

The Fundamental Theorem of Linear Algebra has as many as four parts. Its 
presentation often stops with Part 1, but the reader is urged to include Part 2. 
(That is the only part we will prove—it is too valuable to miss. This is also as far as 
we go in teaching.) The last two parts, at the end of this paper, sharpen the first 
two. The complete picture shows the action of A on the four subspaces with the 
right bases. Those bases come from the singular value decomposition. 

The Fundamental Theorem begins with 

Part 1. The dimensions of the subspaces. 
Part 2. The orthogonality of the subspaces. 
The dimensions obey the most important laws of linear algebra:

$$\dim R(A) = \dim R(A^T) \quad\text{and}\quad \dim R(A) + \dim N(A) = n.$$


When the row space has dimension r, the nullspace has dimension n − r.
Elimination identifies r pivot variables and n — r free variables. Those variables 
correspond, in the echelon form, to columns with pivots and columns without 
pivots. They give the dimension count r and n —r. Students see this for the 
echelon matrix and believe it for A. 
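
The dimension count can be tried out with a short elimination routine. This is a minimal sketch, not a production algorithm: the function names and the 3 by 3 matrix are hypothetical, and exact fractions are used so that "zero" means exactly zero.

```python
from fractions import Fraction

def rank(A):
    """Count the pivots found by elimination, using exact arithmetic."""
    M = [[Fraction(v) for v in row] for row in A]
    m, n = len(M), len(M[0])
    r = 0  # number of pivots found so far
    for col in range(n):
        # Look for a nonzero entry in this column at or below row r.
        pivot = next((i for i in range(r, m) if M[i][col] != 0), None)
        if pivot is None:
            continue  # no pivot here: this column belongs to a free variable
        M[r], M[pivot] = M[pivot], M[r]  # row exchange
        for i in range(r + 1, m):
            factor = M[i][col] / M[r][col]
            M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        r += 1
        if r == m:
            break
    return r

def transpose(A):
    """Rows of the transpose are the columns of A."""
    return [list(col) for col in zip(*A)]

A = [[1, 2, 3], [2, 4, 6], [1, 0, 1]]  # hypothetical matrix with two pivots
r = rank(A)
n = len(A[0])
print(r, rank(transpose(A)), n - r)    # 2 2 1
```

The row rank equals the column rank, and the number of free variables is n − r.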

The orthogonality of those spaces is also essential, and very easy. Every x in the 
nullspace is perpendicular to every row of the matrix, exactly because Ax = 0: 


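$$Ax = \begin{bmatrix} \text{row } 1 \\ \text{row } 2 \\ \vdots \\ \text{row } m \end{bmatrix} x = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}.$$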


The first zero is the dot product of x with row 1. The last zero is the dot product 
with row m. One at a time, the rows are perpendicular to any x in the nullspace. 
So x is perpendicular to all combinations of the rows. 


The nullspace N(A) is orthogonal to the row space $R(A^T)$.


What is the fourth subspace? If the matrix A leads to R(A) and N(A), then its
transpose must lead to $R(A^T)$ and $N(A^T)$. The fourth subspace is $N(A^T)$, the
nullspace of $A^T$. We need it! The theory of linear algebra is bound up in the
connections between row spaces and column spaces. If $R(A^T)$ is orthogonal to
N(A), then, just by transposing, the column space R(A) is orthogonal to the
“left nullspace” $N(A^T)$. Look at $A^Ty = 0$:

$$A^Ty = \begin{bmatrix} \text{column } 1 \text{ of } A \\ \vdots \\ \text{column } n \text{ of } A \end{bmatrix} y = \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix}.$$

Since y is orthogonal to each column (producing each zero), y is orthogonal to the
whole column space. The point is that $A^T$ is just as good a matrix as A. Nothing is
new, except $A^T$ is n by m. Therefore the left nullspace has dimension m − r.

$A^Ty = 0$ means the same as $y^TA = 0^T$. With the vector on the left, $y^TA$ is a
combination of the rows of A. Contrast that with Ax = combination of the
columns.
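
Both orthogonality statements can be confirmed by plain dot products. In this sketch the matrix is the rank-one example used later in the paper; the vectors x and y were picked by hand for the illustration and are assumptions, as is the helper name `dot`.

```python
def dot(u, v):
    """Ordinary dot product of two vectors given as lists."""
    return sum(a * b for a, b in zip(u, v))

A = [[1, 2], [3, 6]]                      # rank-one example matrix
rows = A
columns = [list(col) for col in zip(*A)]  # the columns of A are the rows of A^T

x = [-2, 1]   # hand-picked nullspace vector: Ax = 0
y = [-3, 1]   # hand-picked left nullspace vector: A^T y = 0

# x is perpendicular to every row, hence to every combination of the rows.
row_checks = [dot(row, x) for row in rows]
combo = [5 * r1 - 2 * r2 for r1, r2 in zip(*rows)]  # arbitrary combination 5*row1 - 2*row2

# y is perpendicular to every column.
column_checks = [dot(col, y) for col in columns]

print(row_checks, dot(combo, x), column_checks)   # [0, 0] 0 [0, 0]
```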


The First Picture: Linear Equations 


Figure 1 shows how A takes x into the column space. The nullspace goes to the 
zero vector. Nothing goes elsewhere in the left nullspace—which is waiting its 
turn. 

With b in the column space, Ax = b can be solved. There is a particular
solution $x_r$ in the row space. The homogeneous solutions $x_n$ form the nullspace.
The general solution is $x_r + x_n$. The particularity of $x_r$ is that it is orthogonal to
every $x_n$.

May I add a personal note about this figure? Many readers of Linear Algebra 
and Its Applications [4] have seen it as fundamental. It captures so much about 
Ax = b. Some letters suggested other ways to draw the orthogonal subspaces— 
artistically this is the hardest part. The four subspaces (and very possibly the figure 
itself) are of course not original. But as a key to the teaching of linear algebra, this 
illustration is a gold mine. 
[Figure 1 labels: column space (dim r); nullspace of $A^T$ (dim m − r); nullspace of A (dim n − r).]

Figure 1. The action of A: Row space to column space, nullspace to zero.


Other writers made a further suggestion. They proposed a lower level textbook, 
recognizing that the range of students who need linear algebra (and the variety of 
preparation) is enormous. That new book contains Figures 1 and 2—also Figure 0, 
to show the dimensions first. The explanation is much more gradual than in this 
paper—but every course has to study subspaces! We should teach the important 
ones. 


The Second Figure: Least Squares Equations 


If b is not in the column space, Ax = b cannot be solved. In practice we still 
have to come up with a “solution.” It is extremely common to have more equations 
than unknowns—more output data than input controls, more measurements than 
parameters to describe them. The data may lie close to a straight line b = C + Dt. 
A parabola C + Dt + Et? would come closer. Whether we use polynomials or 
sines and cosines or exponentials, the problem is still linear in the coefficients 
C, D, E: 


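$$\begin{aligned} C + Dt_1 &= b_1 \\ &\ \,\vdots \\ C + Dt_m &= b_m \end{aligned} \qquad\text{or}\qquad \begin{aligned} C + Dt_1 + Et_1^2 &= b_1 \\ &\ \,\vdots \\ C + Dt_m + Et_m^2 &= b_m \end{aligned}$$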


There are n = 2 or n = 3 unknowns, and m is larger. There is no x = (C, D) or
x = (C, D, E) that satisfies all m equations. Ax = b has a solution only when the
points lie exactly on a line or a parabola—then b is in the column space of the m 
by 2 or m by 3 matrix A. 

The solution is to make the error b — Ax as small as possible. Since Ax can 
never leave the column space, choose the closest point to b in that subspace. This 
point is the projection p. Then the error vector e = b — p has minimal length. 

To repeat: The best combination $p = A\hat{x}$ is the projection of b onto the column
space. The error e is perpendicular to that subspace. Therefore $e = b - A\hat{x}$ is in
the left nullspace:

$$A^T(b - A\hat{x}) = 0 \quad\text{or}\quad A^TA\hat{x} = A^Tb.$$

Calculus reaches the same linear equations by minimizing the quadratic $\|b - Ax\|^2$.
The chain rule just multiplies both sides of Ax = b by $A^T$.
The “normal equations” are $A^TA\hat{x} = A^Tb$. They illustrate what is almost invariably
true: applications that start with a rectangular A end up computing with the
square symmetric matrix $A^TA$. This matrix is invertible provided A has independent
columns. We make that assumption: The nullspace of A contains only x = 0.
(Then $A^TAx = 0$ implies $x^TA^TAx = 0$ which implies Ax = 0 which forces x = 0, so
$A^TA$ is invertible.) The picture for least squares shows the action over on the right
side, the splitting of b into p + e.
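
For the straight line b = C + Dt the normal equations are only 2 by 2, so they can be solved by hand or in a few lines of code. The sketch below is an illustration under assumptions: the function name `fit_line` and the three data points are hypothetical, and Cramer's rule stands in for whatever solver one would use in practice.

```python
from fractions import Fraction

def fit_line(ts, bs):
    """Solve the normal equations A^T A x = A^T b for the line b ~ C + D t.

    The columns of A are (1, ..., 1) and (t_1, ..., t_m).
    """
    m = Fraction(len(ts))
    st = Fraction(sum(ts))                             # sum of t_i
    stt = Fraction(sum(t * t for t in ts))             # sum of t_i^2
    sb = Fraction(sum(bs))                             # sum of b_i
    stb = Fraction(sum(t * b for t, b in zip(ts, bs))) # sum of t_i b_i
    det = m * stt - st * st   # nonzero when the t_i are not all equal
    C = (sb * stt - st * stb) / det
    D = (m * stb - st * sb) / det
    return C, D

ts = [0, 1, 2]   # hypothetical measurement times
bs = [6, 0, 0]   # hypothetical measurements, not on a line
C, D = fit_line(ts, bs)
errors = [b - (C + D * t) for t, b in zip(ts, bs)]   # e = b - p

print(C, D)   # 5 -3
```

The error vector is perpendicular to both columns of A: its entries sum to zero, and so do the products $t_i e_i$. That is exactly the statement that e lies in the left nullspace.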


[Figure 2 label: Ax = b not possible.]

Figure 2. Least squares: $\hat{x}$ minimizes $\|b - Ax\|^2$ by solving $A^TA\hat{x} = A^Tb$.


The Third Figure: Orthogonal Bases 


Up to this point, nothing was said about bases for the four subspaces. Those 
bases can be constructed from an echelon form—the output from elimination. 
This construction is simple, but the bases are not perfect. A really good choice, in 
fact a “canonical choice” that is close to unique, would achieve much more. To 
complete the Fundamental Theorem, we make two requirements: 


Part 3. The basis vectors are orthonormal. 
Part 4. The matrix with respect to these bases is diagonal. 


If $v_1, \ldots, v_r$ is the basis for the row space and $u_1, \ldots, u_r$ is the basis for the
column space, then $Av_i = \sigma_iu_i$. That gives a diagonal matrix $\Sigma$. We can further
ensure that $\sigma_i > 0$.

Orthonormal bases are no problem; the Gram-Schmidt process is available.
But a diagonal form involves eigenvalues. In this case they are the eigenvalues of
$A^TA$ and $AA^T$. Those matrices are symmetric and positive semidefinite, so they
have nonnegative eigenvalues and orthonormal eigenvectors (which are the bases!).
Starting from $A^TAv_i = \sigma_i^2v_i$, here are the key steps:

$$v_i^TA^TAv_i = \sigma_i^2v_i^Tv_i \quad\text{so that}\quad \|Av_i\| = \sigma_i$$

$$AA^TAv_i = \sigma_i^2Av_i \quad\text{so that}\quad u_i = Av_i/\sigma_i \ \text{is a unit eigenvector of } AA^T.$$

All these matrices have rank r. The r positive eigenvalues $\sigma_i^2$ give the diagonal
entries $\sigma_i$ of $\Sigma$.
The whole construction is called the singular value decomposition (SVD). It
amounts to a factorization of the original matrix A into $U\Sigma V^T$, where

1. U is an m by m orthogonal matrix. Its columns $u_1, \ldots, u_r, \ldots, u_m$ are basis
vectors for the column space and left nullspace.

2. $\Sigma$ is an m by n diagonal matrix. Its nonzero entries are $\sigma_1 > 0, \ldots, \sigma_r > 0$.

3. V is an n by n orthogonal matrix. Its columns $v_1, \ldots, v_r, \ldots, v_n$ are basis
vectors for the row space and nullspace.

The equations $Av_i = \sigma_iu_i$ mean that $AV = U\Sigma$. Then multiplication by $V^T$
gives $A = U\Sigma V^T$.

When A itself is symmetric, its eigenvectors $u_i$ make it diagonal: $A = U\Lambda U^T$.
The singular value decomposition extends this spectral theorem to matrices that
are not symmetric and not square. The eigenvalues are in $\Lambda$, the singular values
are in $\Sigma$. The factorization $A = U\Sigma V^T$ joins A = LU (elimination) and A = QR
(orthogonalization) as a beautifully direct statement of a central theorem in linear
algebra.

The history of the SVD is cloudy, beginning with Beltrami and Jordan in the 
1870’s, but its importance is clear. For a very quick history and proof, and much 
more about its uses, please see [1]. “The most recurring theme in the book is the 
practical and theoretical value of this matrix decomposition.” The SVD in linear 
algebra corresponds to the Cartan decomposition in Lie theory [3]. This is one 
more case, if further convincing is necessary, in which mathematics gets the 
properties right—and the applications follow. 


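Example

$$A = \begin{bmatrix} 1 & 2 \\ 3 & 6 \end{bmatrix} = U\Sigma V^T = \frac{1}{\sqrt{10}}\begin{bmatrix} 1 & -3 \\ 3 & 1 \end{bmatrix} \begin{bmatrix} \sqrt{50} & 0 \\ 0 & 0 \end{bmatrix} \frac{1}{\sqrt{5}}\begin{bmatrix} 1 & 2 \\ -2 & 1 \end{bmatrix}$$

All four subspaces are 1-dimensional. The columns of A are multiples of $\begin{bmatrix} 1 \\ 3 \end{bmatrix}$ in U.
The rows are multiples of [1 2] in $V^T$. Both $A^TA$ and $AA^T$ have eigenvalues 50
and 0. So the only singular value is $\sigma_1 = \sqrt{50}$.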


Figure 3. Orthonormal bases that diagonalize A. 
The SVD expresses A as a combination of r rank-one matrices:

$$A = U\Sigma V^T = u_1\sigma_1v_1^T + \cdots + u_r\sigma_rv_r^T \qquad \left(\text{here } A = \begin{bmatrix} 1 \\ 3 \end{bmatrix}\begin{bmatrix} 1 & 2 \end{bmatrix}\right).$$
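
The key steps of the construction can be replayed numerically on the 2 by 2 example with rows (1, 2) and (3, 6). This is a sketch for this one matrix only: the eigenvector shortcut below assumes the off-diagonal entry of $A^TA$ is nonzero, and the variable names are hypothetical.

```python
import math

A = [[1.0, 2.0], [3.0, 6.0]]

# Entries of the symmetric matrix A^T A.
AtA = [[sum(A[k][i] * A[k][j] for k in range(2)) for j in range(2)]
       for i in range(2)]

# Largest eigenvalue of a symmetric 2 by 2 matrix from its trace and determinant.
trace = AtA[0][0] + AtA[1][1]
det = AtA[0][0] * AtA[1][1] - AtA[0][1] * AtA[1][0]
lam1 = (trace + math.sqrt(trace * trace - 4.0 * det)) / 2.0
sigma1 = math.sqrt(lam1)

# Unit eigenvector v1 of A^T A (shortcut valid because AtA[0][1] != 0 here).
raw = (AtA[0][1], lam1 - AtA[0][0])
norm = math.hypot(raw[0], raw[1])
v1 = (raw[0] / norm, raw[1] / norm)

# u1 = A v1 / sigma1 is the matching unit vector in the column space.
u1 = tuple((A[i][0] * v1[0] + A[i][1] * v1[1]) / sigma1 for i in range(2))

# Rank-one reconstruction sigma1 * u1 * v1^T.
rebuilt = [[sigma1 * u1[i] * v1[j] for j in range(2)] for i in range(2)]
print(rebuilt)
```

Up to rounding, `v1` is (1, 2)/√5, `u1` is (1, 3)/√10, and the single rank-one term reproduces A.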


The Fourth Figure: The Pseudoinverse 


The SVD leads directly to the “pseudoinverse” of A. This is needed, just as the
least squares solution $\hat{x}$ was needed, to invert A and solve Ax = b when those
steps are strictly speaking impossible. The pseudoinverse $A^+$ agrees with $A^{-1}$
when A is invertible. The least squares solution of minimum length (having no
nullspace component) is $x^+ = A^+b$. It coincides with $\hat{x}$ when A has full column
rank r = n; then $A^TA$ is invertible and Figure 4 becomes Figure 2.

$A^+$ takes the column space back to the row space [4]. On these spaces of equal
dimension r, the matrix A is invertible and $A^+$ inverts it. On the left nullspace,
$A^+$ is zero. I hope you will feel, after looking at Figure 4, that this is the one
natural best definition of an inverse. Despite those good adjectives, the SVD and
$A^+$ are too much for an introductory linear algebra course. They belong in a second
course. Still the picture with the four subspaces is absolutely intuitive.


[Figure 4 labels: column space; $p = Ax^+$; nullspace of $A^T$; nullspace of A.]

Figure 4. The inverse of A (where possible) is the pseudoinverse $A^+$.


The SVD gives an easy formula for $A^+$, because it chooses the right bases. Since
$Av_i = \sigma_iu_i$, the inverse has to be $A^+u_i = v_i/\sigma_i$. Thus the pseudoinverse of $\Sigma$
contains the reciprocals $1/\sigma_i$. The orthogonal matrices U and $V^T$ are inverted by
$U^T$ and V. All together, the pseudoinverse of $A = U\Sigma V^T$ is $A^+ = V\Sigma^+U^T$.


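Example (continued)

$$A^+ = V\Sigma^+U^T = \frac{1}{\sqrt{5}}\begin{bmatrix} 1 & -2 \\ 2 & 1 \end{bmatrix} \begin{bmatrix} 1/\sqrt{50} & 0 \\ 0 & 0 \end{bmatrix} \frac{1}{\sqrt{10}}\begin{bmatrix} 1 & 3 \\ -3 & 1 \end{bmatrix} = \frac{1}{50}\begin{bmatrix} 1 & 3 \\ 2 & 6 \end{bmatrix}$$

Always $A^+A$ is the identity matrix on the row space, and zero on the nullspace:

$$A^+A = \frac{1}{50}\begin{bmatrix} 10 & 20 \\ 20 & 40 \end{bmatrix} = \text{projection onto the line through } \begin{bmatrix} 1 \\ 2 \end{bmatrix}.$$

Similarly $AA^+$ is the identity on the column space, and zero on the left nullspace:

$$AA^+ = \frac{1}{50}\begin{bmatrix} 5 & 15 \\ 15 & 45 \end{bmatrix} = \text{projection onto the line through } \begin{bmatrix} 1 \\ 3 \end{bmatrix}.$$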
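
These two projections can be verified in exact arithmetic. The sketch below relies on a shortcut that holds only for a rank-one matrix: with a single singular value σ, the pseudoinverse is the transpose divided by σ². The helper name `matmul` is an assumption for the illustration.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

A = [[Fraction(1), Fraction(2)], [Fraction(3), Fraction(6)]]
sigma_squared = Fraction(50)   # the only nonzero eigenvalue of A^T A

# Rank-one shortcut: A+ = (1/sigma) v u^T = A^T / sigma^2.
A_plus = [[A[j][i] / sigma_squared for j in range(2)] for i in range(2)]

P_row = matmul(A_plus, A)   # A+ A, projection onto the row space
P_col = matmul(A, A_plus)   # A A+, projection onto the column space

print([[str(50 * entry) for entry in row] for row in P_row])
# [['10', '20'], ['20', '40']]
```

Each product is symmetric and equal to its own square, which is what being an orthogonal projection means.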


A Summary of the Key Ideas 


From its r-dimensional row space to its r-dimensional column space, A yields 
an invertible linear transformation. 


Proof: Suppose x and x’ are in the row space, and Ax equals Ax’ in the column 
space. Then x — x’ is in both the row space and nullspace. It is perpendicular to 
itself. Therefore x = x’ and the transformation is one-to-one. 


The SVD chooses good bases for those subspaces. Compare with the Jordan form 
for a real square matrix. There we are choosing the same basis for both domain 
and range, so our hands are tied. The best we can do is $SAS^{-1} = J$ or SA = JS. In
general J is not real. If real, then in general it is not diagonal. If diagonal, then in 
general S is not orthogonal. By choosing two bases, not one, every matrix does as 
well as a symmetric matrix. The bases are orthonormal and A is diagonalized. 

Some applications permit two bases and others don’t. For powers $A^n$ we need
$S^{-1}$ to cancel S. Only a similarity is allowed (one basis). In a differential equation
u' = Au, we can make one change of variable u = Sv. Then $v' = S^{-1}ASv$. But for
Ax = b, the domain and range are philosophically “not the same space.” The row 
and column spaces are isomorphic, but their bases can be different. And for least 
squares the SVD is perfect. 

This figure by Tom Hern and Cliff Long [2] shows the diagonalization of A. 
Basis vectors go to basis vectors (principal axes). A circle goes to an ellipse. The
matrix is factored into $U\Sigma V^T$. Behind the scenes are two symmetric matrices $A^TA$
and $AA^T$. So we reach two orthogonal matrices U and V.


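[Figure (Hern and Long): the action of A on a circle with vectors $v_1$, $v_2$, giving an ellipse with principal axes $\sigma_1u_1$, $\sigma_2u_2$.]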


We close by summarizing the action of A and $A^T$ and $A^+$:

$$Av_i = \sigma_iu_i \qquad A^Tu_i = \sigma_iv_i \qquad A^+u_i = v_i/\sigma_i \qquad 1 \le i \le r.$$


The nullspaces go to zero. Linearity does the rest. 
An Identity of Daubechies


The generalization of an identity of 
Daubechies using a probabilistic interpre- 
tation by D. Zeilberger [100 (1993) 487], 
has already appeared in SIAM Review 
Problem 85-10 (June, 1985) in a slightly
more general context. In addition to a
similar probabilistic derivation there is 
also a direct algebraic proof. Incidentally, 
problem 10223 [99 (1992) 462] is the same 
as the identity of Daubechies and a slight 
generalization of this identity has ap- 
peared previously as problem 183, Crux 
Math. 3(1977) 69-70 and came from a list 
of problems considered for the Canadian 
Mathematical Olympiad. There was an 
inductive solution of the latter by Mark 
Kleinman, a high school student at the 
time and one of the top students in the 
U.S.A.M.O. and the I.M.O. 


M. S. Klamkin 
A Simple Proof of the Jordan-Alexander
Complement Theorem 


Albrecht Dold 


The complements of homeomorphic subsets $A, B \subset \mathbf{R}^n$ of Euclidean space need
not be homeomorphic: $A \approx B \not\Rightarrow (\mathbf{R}^n - A) \approx (\mathbf{R}^n - B)$. This is well illustrated by
classical knot theory, i.e. when A, B are knots in $\mathbf{R}^3$. The complements usually
have different fundamental groups in this case, $\pi_1(\mathbf{R}^3 - A) \neq \pi_1(\mathbf{R}^3 - B)$, and this
fundamental group serves to distinguish non-equivalent knots.

On the other hand, it is a classical consequence of Alexander duality (cf. [D], 
VIII, 8.15) that the homology groups of the complements agree if A,B are 
homeomorphic closed subsets of R”. Thus, 


Theorem. If $A, B \subset \mathbf{R}^n$ are homeomorphic closed subsets then their complements
have isomorphic homology groups, $H(\mathbf{R}^n - A) \cong H(\mathbf{R}^n - B)$, also in generalised
(co-)homology.


If the coefficients of homology are taken in a commutative ring R with 1 then
the rank of $H_0(\mathbf{R}^n - A)$ equals the number of components of $\mathbf{R}^n - A$ (almost by
definition of $H_0$). Therefore,


Corollary. The complements of homeomorphic closed subsets $A, B \subset \mathbf{R}^n$ have the
same number of components.


If $A = \{x \in \mathbf{R}^n \mid \|x\| = 1\} = S^{n-1}$, the Jordan separation theorem is: Every
subset $B \subset \mathbf{R}^n$, n > 1, which is homeomorphic to $S^{n-1}$ separates $\mathbf{R}^n$ into two
regions.

In this note we give a simple proof of the theorem. It uses basic properties only 
of homology, namely homotopy invariance and Mayer-Vietoris sequences of open 
subsets of Euclidean spaces. The reader might take singular or simplicial homology 
but the proof also works in general (co-)homology—no dimension axiom is 
required. No priority is claimed for this note. Its methods are familiar in topology 
and algebraic geometry; the intention is to publicize an elegant argument. 

It is convenient to use reduced homology in the proof. The reduced homology
$\tilde{H}X$ of a non-empty space X is the kernel of the homomorphism $HX \to HP$ which
is induced by the map $X \to P$ onto the one-point space P. If we choose a point in
X, which we write as a map $P \to X$, then the composition $P \to X \to P$ is the
identity map, hence $HX = \operatorname{im}(HP \to HX) \oplus \ker(HX \to HP) \cong HP \oplus \tilde{H}X$. Thus,
$\tilde{H}X$ differs from HX by the constant summand HP only. In particular, $\tilde{H}P = 0$
and by homotopy invariance, $\tilde{H}X = 0$ for every contractible space X.


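Proposition 1. For every closed subset $A \subset \mathbf{R}^n$, $A \neq \mathbf{R}^n$, we have
$\tilde{H}_j(\mathbf{R}^n - A) \cong \tilde{H}_{j+1}(\mathbf{R}^{n+1} - A)$, where $\mathbf{R}^{n+1} = \mathbf{R}^n \times \mathbf{R}$, $\mathbf{R}^n = \mathbf{R}^n \times \{0\} \subset \mathbf{R}^{n+1}$.

Proof: Put $Z = \mathbf{R}^{n+1} - A$,

$$Z_+ = \{(x,t) \in \mathbf{R}^n \times \mathbf{R} \mid t > 0 \text{ or } x \in (\mathbf{R}^n - A)\},$$

$$Z_- = \{(x,t) \in \mathbf{R}^{n+1} \mid t < 0 \text{ or } x \in (\mathbf{R}^n - A)\}.$$

Then $Z_+$, $Z_-$ are open in Z,

$$Z_+ \cup Z_- = Z, \qquad Z_+ \cap Z_- = (\mathbf{R}^n - A) \times \mathbf{R}.$$

Furthermore, $Z_+$ and $Z_-$ are contractible (the deformation $(x,t) \mapsto
(x, (1-\tau)t + \tau)$, $0 \le \tau \le 1$, moves $Z_+$ into the hyperplane t = 1 which in turn
deforms into a point), hence $\tilde{H}(Z_+) = 0 = \tilde{H}(Z_-)$. The reduced Mayer-Vietoris
sequence (which is the ordinary Mayer-Vietoris sequence without the superfluous
constant summands HP; cf. [D], III, 8.15) has the form

$$\tilde{H}_{j+1}(Z_+) \oplus \tilde{H}_{j+1}(Z_-) \to \tilde{H}_{j+1}(Z_+ \cup Z_-) \to \tilde{H}_j(Z_+ \cap Z_-) \to \tilde{H}_j(Z_+) \oplus \tilde{H}_j(Z_-).$$

As it is exact and $\tilde{H}(Z_+) = 0 = \tilde{H}(Z_-)$ it amounts to an isomorphism
$\tilde{H}_{j+1}(Z_+ \cup Z_-) \cong \tilde{H}_j(Z_+ \cap Z_-)$. But $Z_+ \cup Z_- = \mathbf{R}^{n+1} - A$ and $Z_+ \cap Z_- = (\mathbf{R}^n - A) \times \mathbf{R}$;
the latter deforms into $(\mathbf{R}^n - A) \times \{0\} = \mathbf{R}^n - A$. ∎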


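Iterating Proposition 1 we get


Proposition 2. For every closed subset $A \subset \mathbf{R}^n$, $A \neq \mathbf{R}^n$, and every $q \ge 0$,
$\tilde{H}_{j+q}(\mathbf{R}^{n+q} - A) \cong \tilde{H}_j(\mathbf{R}^n - A)$. ∎


Proposition 3. If $A \subset \mathbf{R}^p$, $B \subset \mathbf{R}^q$ are closed subsets which are homeomorphic then
the complements of $A = A \times \{0\}$ and $B = \{0\} \times B$ in $\mathbf{R}^{p+q} = \mathbf{R}^p \times \mathbf{R}^q$ are also
homeomorphic, $(\mathbf{R}^{p+q} - A) \approx (\mathbf{R}^{p+q} - B)$.


Proof: (Compare [FM], §3). Let $\varphi\colon A \rightleftarrows B :\!\psi$ be reciprocal homeomorphisms,
$\psi\varphi = 1_A$, $\varphi\psi = 1_B$. By Tietze’s Lemma, these extend to continuous maps
$\Phi\colon \mathbf{R}^p \rightleftarrows \mathbf{R}^q :\!\Psi$. The maps $L, R\colon \mathbf{R}^p \times \mathbf{R}^q \to \mathbf{R}^p \times \mathbf{R}^q$, $L(x,y) = (x, y - \Phi(x))$,
$R(x,y) = (x - \Psi(y), y)$ are self-homeomorphisms of $\mathbf{R}^{p+q}$, and they map the
graph $\Gamma = \{(x,y) \in \mathbf{R}^p \times \mathbf{R}^q \mid x \in A, y = \varphi(x)\} = \{(x,y) \mid y \in B, x = \psi(y)\}$ onto
A resp. B. Hence $(\mathbf{R}^{p+q} - A) \overset{L}{\approx} (\mathbf{R}^{p+q} - \Gamma) \overset{R}{\approx} (\mathbf{R}^{p+q} - B)$. ∎

Proof of the theorem. If both A and B are $\neq \mathbf{R}^n$ we apply propositions 2, 3, 2 in
this order, $\tilde{H}_j(\mathbf{R}^n - A) \cong \tilde{H}_{j+n}(\mathbf{R}^{n+n} - A) \cong \tilde{H}_{j+n}(\mathbf{R}^{n+n} - B) \cong \tilde{H}_j(\mathbf{R}^n - B)$.
Adding $H_jP$ to both ends gives $H_j(\mathbf{R}^n - A) \cong H_j(\mathbf{R}^n - B)$, as required.

If $A = \mathbf{R}^n$ we still have $\tilde{H}(\mathbf{R}^{n+1} - A) \cong \tilde{H}(\mathbf{R}^{n+1} - B)$ by the same argument;
in particular, $\tilde{H}_0(\mathbf{R}^{n+1} - A) \cong \tilde{H}_0(\mathbf{R}^{n+1} - B)$. But $\mathbf{R}^{n+1} - A = \mathbf{R}^{n+1} - \mathbf{R}^n$ has
two components and $\mathbf{R}^{n+1} - B$ has only one component, unless $B = \mathbf{R}^n$. Therefore
(in ordinary homology), $\operatorname{rank}(\tilde{H}_0(\mathbf{R}^{n+1} - A)) = 1$ and $\operatorname{rank}(\tilde{H}_0(\mathbf{R}^{n+1} - B)) = 0$,
unless $B = \mathbf{R}^n$. Therefore, $B = \mathbf{R}^n = A$. ∎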
Squaring the Circle with Holes


Hansklaus Rummler 


1. WALLIS’ PRODUCT. Among the approximations of π, Wallis’ product

$$\frac{\pi}{2} = \frac{2}{1}\cdot\frac{2}{3}\cdot\frac{4}{3}\cdot\frac{4}{5}\cdot\frac{6}{5}\cdot\frac{6}{7}\cdots$$

is perhaps the most fascinating one. Sure, it is not really useful in calculating π,
the product converging very slowly. But the formula is already interesting for its
history: Wallis’ somewhat mysterious (or even mystic) discovery of the formula
inspired Newton to similar calculations, leading finally to the binomial series (see
[1]).

Nowadays, the proof of Wallis’ formula has become a standard exercise: 
Calculating the integral 


for every natural number m leads to 
a2 1 3 2-1 2 4 6 2n 
an 22 4 2n 


and from this Wallis’ product formula is easily derived. 
An alternative proof is obtained by taking $z = \tfrac{1}{2}$ in the Weierstraß product

$$\sin(\pi z) = \pi z \prod_{\nu=1}^{\infty}\left(1 - \frac{z^2}{\nu^2}\right).$$


Unfortunately, neither proof helps to understand the formula. To explain what 
we mean by understanding a formula, let us consider Vieta’s formula: 


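$$\frac{2}{\pi} = \sqrt{\frac{1}{2}}\cdot\sqrt{\frac{1}{2} + \frac{1}{2}\sqrt{\frac{1}{2}}}\cdot\sqrt{\frac{1}{2} + \frac{1}{2}\sqrt{\frac{1}{2} + \frac{1}{2}\sqrt{\frac{1}{2}}}}\cdots$$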


The factors of this infinite product are much more complicated than those of 
Wallis’ product, but they have a simple geometric meaning, because they represent 
length ratios: If $l_n$ denotes the length of a regular $2^n$-gon inscribed in the unit
circle, it can be shown that the factors of Vieta’s product are just the ratios
$l_n : l_{n+1}$, and the formula is immediately clear:


(The inscribed 2-gon is a diameter, counted twice.) 


2. WALLIS’ SIEVE. Instead of trying to understand Wallis’ product formula in the
same sense we understand Vieta’s formula, we shall interpret it, constructing a
subset of the unit square that is easily seen to have area $\frac{8}{9}\cdot\frac{24}{25}\cdot\frac{48}{49}\cdots$, which, by
Wallis’ product formula, is just the area of the inscribed disk, namely $\pi/4$.

In order to construct this set, let us say that we punch a hole of order n into a
square, n being an odd integer, if we take away the middle open one of the $n^2$
congruent small squares into which we can decompose the given square.

Now take a compact unit square and punch a hole of order 3 into this square.
The remaining set $W_1$ has of course area

$$\mu(W_1) = \frac{8}{9}.$$

Punching a hole of order 5 into each of the 8 small squares forming $W_1$, we get a
set $W_2$ consisting of $8 \cdot 24$ small squares and with area

$$\mu(W_2) = \frac{8}{9}\cdot\frac{24}{25}.$$

Continuing in this way by punching holes of order 7, 9, 11 and so on, we get
finally Wallis’ sieve, a compact set $W_\infty$ with area

$$\mu(W_\infty) = \frac{8}{9}\cdot\frac{24}{25}\cdot\frac{48}{49}\cdots = \frac{\pi}{4}.$$
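The slow convergence of these areas to $\pi/4$ can be watched numerically. The following is a minimal Python sketch, not part of the original note; the function name `sieve_area` is an illustrative choice, and floating-point arithmetic is assumed to be accurate enough for the step counts shown.

```python
import math

def sieve_area(steps: int) -> float:
    """Area left after punching holes of order 3, 5, ..., 2*steps + 1."""
    area = 1.0
    for k in range(1, steps + 1):
        n = 2 * k + 1                   # order of the hole punched at step k
        area *= (n * n - 1) / (n * n)   # each small square keeps n^2 - 1 of its n^2 parts
    return area

for steps in (1, 2, 10, 1000, 100000):
    print(steps, sieve_area(steps), math.pi / 4)
```

The partial areas decrease toward $\pi/4$ from above, and the gap shrinks only roughly like the reciprocal of the number of steps.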


The following figures show the first three steps of our construction: 


Figure 1 
Figure 2


3. WALLIS’ SIEVE AND LEBESGUE MEASURE. So far, there seems to be no
problem in calculating the area of Wallis’ sieve $W_\infty$, and by construction this area is
just $\mu(W_\infty) = \frac{8}{9}\cdot\frac{24}{25}\cdot\frac{48}{49}\cdots = \pi/4$. But we have to be careful: area here means
Lebesgue measure, because $W_\infty$ is not measurable in Jordan’s sense, its interior
measure being 0. $W_\infty$ does not even contain any product set $A \times B$ with $A, B \subset \mathbb{R}$
having positive Lebesgue measure. To see this, consider a maximal product subset
of $W_1$, for instance $[0, 1] \times ([0, \frac{1}{3}] \cup [\frac{2}{3}, 1])$. This subset has measure $\frac{2}{3}$, a maximal
product subset of $W_2$ has measure $\frac{2}{3}\cdot\frac{4}{5}$, and so on. Therefore, a maximal product
subset of $W_\infty$ has measure $\frac{2}{3}\cdot\frac{4}{5}\cdot\frac{6}{7}\cdots = 0$.
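That the product $\frac{2}{3}\cdot\frac{4}{5}\cdot\frac{6}{7}\cdots$ tends to 0, in contrast with the area product, can also be seen numerically. A minimal Python sketch under the same assumptions as any floating-point experiment (the name `product_subset_measure` is illustrative only):

```python
def product_subset_measure(steps: int) -> float:
    """Partial product (2/3)(4/5)...(2*steps / (2*steps + 1))."""
    measure = 1.0
    for k in range(1, steps + 1):
        measure *= (2 * k) / (2 * k + 1)
    return measure

for steps in (1, 2, 100, 10000, 100000):
    print(steps, product_subset_measure(steps))
```

The printed values keep shrinking, slowly, which is consistent with the divergence of the series $\sum 1/(2k+1)$.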

Thus, Wallis’ sieve $W_\infty$ is an example of a subset of the plane $\mathbb{R}^2$ with positive
Lebesgue measure, but not admitting any product subset with positive Lebesgue
measure.
Fermat’s Last Problem


An Englishman named Wiles discovered the key, 
To Fermat’s Last Problem using geometry. 
By proving the sum of two powers, 


Is a number to the power. 
If and only if the power is smaller than three. 


—Nats Wolraf 
NOTES


Edited by: John Duncan 


Simplifying the Proof of Dirichlet’s 
Theorem 


Paul Monsky 


Dirichlet showed that an arithmetic progression a,a + D,a + 2D,... with D>1 
and (a, D) = 1 contains infinitely many primes. Most of his argument is accessible 
to undergraduate mathematics majors, but a proof of the theorem is seldom 
presented to them because of the reputed difficulty of a key step—showing that 
certain infinite sums are non-zero. This note outlines a simple proof of the 
non-vanishing of these sums. The argument is very close to one given by Gelfond, 
[1], but is easier and works well in the classroom. 

The sums I’ll treat may be described as follows. A “character to the modulus
D” is a function $\chi: \mathbb{Z} \to \mathbb{C}$ satisfying:

(1) If $a \equiv b\ (D)$, then $\chi(a) = \chi(b)$

(2) $\chi(ab) = \chi(a)\chi(b)$

(3) $\chi(a) = 0$ if and only if $(a, D) > 1$.

$\chi$ is said to be real if it takes real values (which can only be 1, $-1$, or 0),
non-principal if it takes values other than 0 or 1. Suppose for example that D is an
odd prime. Then the “Legendre symbol”, taking each quadratic residue of D to 1,
each non-residue to $-1$ and each multiple of D to 0, is a real non-principal
character. For a non-principal $\chi$, $\sum_1^\infty \chi(n)/n$ converges. (This follows from
summation by parts; see the argument given in the last paragraph of this note.)

The usual approach to proving Dirichlet’s theorem involves several standard 
analytic techniques (see [2], for example); the main non-formal step is showing that
$\sum_1^\infty \chi(n)/n \ne 0$ whenever $\chi$ is real and non-principal (the result is also needed for
non-real $\chi$, but this is fairly easily handled). Dirichlet’s original non-vanishing
proof involved a detour through the theory of binary quadratic forms. Modern 
proofs generally use ideal theory in quadratic number fields or some complex 
variable theory. Elementary proofs are also known, but are more complicated than 
the one I’ll now present. 

One begins by defining $c_n$ to be $\sum \chi(d)$ where d ranges over the positive divisors
of n. Evidently $c_{p^k} = 1 + \chi(p) + \chi(p)^2 + \cdots + \chi(p)^k \ge 0$. It follows easily that
$c_n \ge 0$ for all n. Furthermore, $c_n = 1$ whenever n is a power of a prime p dividing
D. In particular $\sum_1^\infty c_n = \infty$.
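These two properties of $c_n$ are easy to spot-check by machine. The sketch below is an illustration only: it assumes the modulus D = 7 and takes $\chi$ to be the Legendre symbol, computed with Euler's criterion; the function names are hypothetical.

```python
def legendre(a: int, p: int) -> int:
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    a %= p
    if a == 0:
        return 0
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1

def c(n: int, p: int) -> int:
    """c_n: the sum of chi(d) over the positive divisors d of n."""
    return sum(legendre(d, p) for d in range(1, n + 1) if n % d == 0)

D = 7  # assumed example modulus (an odd prime)
print([c(n, D) for n in range(1, 21)])       # all entries should be nonnegative
print([c(D ** j, D) for j in range(4)])      # powers of the prime dividing D
```

A check over a finite range is of course no substitute for the multiplicativity argument in the text; it only illustrates it.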

Next, following [1], one sets $f(t) = \sum_1^\infty \chi(n)\,t^n/(1 - t^n)$. The series evidently
converges in [0, 1). Expanding each $t^n/(1 - t^n)$ one finds that $f(t) = \sum_1^\infty c_n t^n$. The
paragraph above shows that $f(t) \to \infty$ as $t \to 1^-$. Suppose now that $\sum_1^\infty \chi(n)/n = 0$.
Then $-f(t) = \sum_1^\infty \chi(n)\{1/(n(1 - t)) - t^n/(1 - t^n)\}$; write this as $\sum_1^\infty \chi(n) b_n$.


The critical observation is that $b_1 \ge b_2 \ge b_3 \ge \cdots$. Note first that

$$(1 - t)(b_n - b_{n+1}) = \frac{1}{n} - \frac{1}{n+1} - \frac{t^n}{1 + t + \cdots + t^{n-1}} + \frac{t^{n+1}}{1 + t + \cdots + t^n}$$

$$= \frac{1}{n(n+1)} - \frac{t^n}{(1 + t + \cdots + t^{n-1})(1 + t + \cdots + t^n)}.$$

Since $(1 + t + \cdots + t^{n-1}) \ge n t^{(n-1)/2}$ while $(1 + t + \cdots + t^n) \ge (n + 1)t^{n/2}$
(this is the inequality of the arithmetic and geometric mean), $b_n - b_{n+1} \ge 0$.

Now $\chi$ is periodic of period D, and $\sum_1^D \chi(n) = 0$. So the numbers $\chi(1)$, $\chi(1) + \chi(2)$,
$\chi(1) + \chi(2) + \chi(3), \ldots$ are bounded in absolute value by D. Since $b_n \to 0$, the
standard Abel rearrangement of the infinite sum $\sum_1^\infty \chi(n) b_n$ shows that $|\sum_1^\infty \chi(n) b_n|
\le D b_1 = D$, contradicting the unboundedness of f on [0, 1).
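The monotonicity of the $b_n$ and the resulting bound can be illustrated numerically. This is a rough Python sketch, not the author's argument: it assumes D = 7 with the Legendre symbol as $\chi$, truncates the sum at 400 terms, and relies on floating-point arithmetic; all names are hypothetical.

```python
def legendre(a: int, p: int) -> int:
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    a %= p
    if a == 0:
        return 0
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1

def b(n: int, t: float) -> float:
    """b_n = 1/(n(1 - t)) - t^n/(1 - t^n) for 0 < t < 1."""
    return 1.0 / (n * (1.0 - t)) - t ** n / (1.0 - t ** n)

def check(t: float, D: int, terms: int = 400):
    """Return (is the sequence b_1..b_terms nonincreasing?, largest |partial sum|)."""
    values = [b(n, t) for n in range(1, terms + 1)]
    nonincreasing = all(values[i] >= values[i + 1] for i in range(terms - 1))
    partial, worst = 0.0, 0.0
    for n, value in enumerate(values, start=1):
        partial += legendre(n, D) * value
        worst = max(worst, abs(partial))
    return nonincreasing, worst

for t in (0.5, 0.9, 0.99):
    print(t, check(t, 7))
```

For each t tried, the sequence is nonincreasing and every partial sum stays below D in absolute value, as the Abel rearrangement predicts.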
Why is $P^2$ Not Embeddable in $\mathbb{R}^3$?


Hiroshi Maehara 


The projective plane $P^2$ is the closed surface obtained by pasting a Möbius band
and a 2-cell together along their boundaries. The surface $P^2$ is not embeddable in
the 3-dimensional Euclidean space $\mathbb{R}^3$. Though this fact is well known, no handy
proof seems to be furnished yet. (A proof in Spanier [3], for instance, requires
cohomology theory.) Here, we offer a short and clear-cut proof of the
non-embeddability of $P^2$ in $\mathbb{R}^3$ by applying the Link Appearing Theorem.

Our figures in $\mathbb{R}^3$ are assumed to be tame. (A figure X in $\mathbb{R}^3$ is tame if there
exists a homeomorphism $f: \mathbb{R}^3 \to \mathbb{R}^3$ such that $f(X)$ is a polygonal or polyhedral
figure.) Thus, we consider only tame embeddings. A (2-component) link is an
embedding of a pair of circles in $\mathbb{R}^3$. Let us call a link trivial if one of the two


Figure 1. Figure 2. 
curves bounds a 2-cell in $\mathbb{R}^3$ that is disjoint from the other curve. Otherwise, it is
non-trivial.

Now, consider a set of six points in $\mathbb{R}^3$, and assume that each pair of these
points is connected by a simple curve such that the curves meet only at their
endpoints. Such a figure is called a complete 6-graph and is usually denoted $K_6$.
Fig. 1 shows a $K_6$ in which six points are indicated by 1, 2, ..., 6. A simple closed
curve in a $K_6$ is called a cycle of the $K_6$. We indicate a cycle by a sequence of
points in the order of appearing when we trace the cycle. In the $K_6$ of Fig. 1, the
pair of cycles 135 and 246 forms a non-trivial link.


Link Appearing Theorem. Any complete 6-graph in $\mathbb{R}^3$ contains a pair of disjoint
cycles that forms a non-trivial link.


This theorem was proved by Sachs [2] and independently by Conway-Gordon
[1]. Its proof is not difficult; see [1] or [2] for the details.

In a rectangular representation of a Möbius band M, the line segment connecting
the midpoints of the to-be-identified sides (the dotted line in Fig. 2) represents
a simple closed curve in the Möbius band M. This closed curve is called the
meridian of M.


Lemma. For any embedding of a Möbius band M in $\mathbb{R}^3$, the pair $(\partial M, C)$ of the
boundary $\partial M$ and the meridian C of M forms a non-trivial link.


Proof: Consider the $K_6$ on the Möbius band M represented in Fig. 3. (Each pair
of the six points 1, 2, ..., 6 is, indeed, connected by a simple curve. For example,
the line segment from the point 1 to the right-top e and the line segment from the
left-bottom e to the point 3 make together a simple curve connecting 1 and 3.)
This $K_6$ contains ten pairs of disjoint cycles:


(123, 456), (124, 356), (125, 346), (126, 345), (134, 256), 
(135, 246), (136, 245), (145, 236), (146, 235), (156, 234). 
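The count of ten can be confirmed by enumeration: two disjoint cycles in a graph on six points must both be triangles, and choosing one triangle fixes the other. A minimal Python sketch (variable names are illustrative):

```python
from itertools import combinations

vertices = range(1, 7)
pairs = []
for triangle in combinations(vertices, 3):
    partner = tuple(v for v in vertices if v not in triangle)
    if triangle < partner:          # list each unordered pair only once
        pairs.append((triangle, partner))

for triangle, partner in pairs:
    print("".join(map(str, triangle)), "".join(map(str, partner)))
print(len(pairs))
```

The pairs appear in the same order as in the list above.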


Each underlined cycle bounds a 2-cell in M that is disjoint from its partner cycle.
For example, the cycle 135 bounds the 2-cell shaded in Fig. 3. Hence, in any
embedding of the Möbius band M in $\mathbb{R}^3$, nine pairs of cycles of $K_6$ other than
(134, 256) are trivial links. Therefore, (134, 256) must be a non-trivial link by the
Link Appearing Theorem. The cycle 256 is the meridian of M, and the cycle 134 is
the boundary $\partial M$. ∎


Figure 3.


Proof of the non-embeddability of $P^2$ in $\mathbb{R}^3$. Suppose $P^2$ is embedded in $\mathbb{R}^3$. By
removing an open 2-cell D from the surface $P^2$, we have a Möbius band M. Then,
the boundary $\partial M$ and the meridian C of M form together a non-trivial link.
Therefore, C and the 2-cell D must intersect each other, a contradiction. ∎
Polynomial Root Dragging


Bruce Anderson 


1. INTRODUCTION. Rolle’s Theorem and other results (such as those found in 
M. Marden [1] and anthologized in E. Barbeau [2]) furnish insight about the 
location of the zeros of the derivative of a polynomial (i.e. the critical points) 
relative to the location of the zeros of the polynomial. These results tend to be 
“static” in that they indicate where the critical points should be expected within 
certain bounds defined by the fixed location of the roots of p(x) (e.g. within 
intervals bounded by the roots of the polynomial for real roots, or within a convex
hull for the complex case). This paper will, in contrast, explore a simple “dynamic”
result, showing how the roots of the derivative will be “affected” as we move (or 
drag) the roots of the polynomial, provided all the roots are real. The results are 
given in Theorem 2.1 and Corollary 2.2. We then show that this result does not 
generalize to complex roots in the obvious way. 

The root dragging result of Corollary 2.2 is then employed to address the 
questions: Do the quartic polynomials produce all possible arrangements of critical 
points which satisfy Rolle’s theorem? Or are there additional constraints on the 
possible arrangement of real critical points for quartics? Theorem 3.1 will furnish 
the perhaps surprising answer. 


2. ROOT DRAGGING. Let p(x) be a polynomial of degree n with all real distinct
roots $x_1 < x_2 < \cdots < x_n$. Suppose we “drag to the right” some or all of these
roots, i.e. we construct a new nth degree polynomial q with all real distinct roots
$x'_1 < x'_2 < \cdots < x'_n$ such that $x'_i \ge x_i$ for all integers i between 1 and n. The
derivatives of p and q, which of course are polynomials of degree $n - 1$, must also
have all real distinct roots by Rolle’s theorem. Let $z_1 < z_2 < \cdots < z_{n-1}$ and
$z'_1 < z'_2 < \cdots < z'_{n-1}$ be the roots of $p'$ and $q'$, respectively. (By Rolle’s theorem,
$x_k < z_k < x_{k+1}$ and $x'_k < z'_k < x'_{k+1}$ for all integers k between 1 and $n - 1$.)
Theorem 2.1. (Root Dragging Theorem). The roots of $q'$ will each be to the right of
the corresponding roots of $p'$; i.e. $z'_k \ge z_k$ for all integers k between 1 and $n - 1$.


Proof: Our analysis will be in the spirit of the proof of the Gauss-Lucas Theorem
found in Marden [1]. We suppose there is some k such that the corresponding
roots, $z_k$ and $z'_k$, are not in the order guaranteed by the theorem, i.e. $z'_k < z_k$. We
show this leads to a contradiction. As shown in Marden [1], we know that the root
$z_k$ of $p'$ must satisfy the equation:

$$\sum_{i=1}^{n} \frac{1}{z_k - x_i} = 0. \qquad (1)$$

Likewise, the root $z'_k$ of $q'$ must satisfy:

$$\sum_{i=1}^{n} \frac{1}{z'_k - x'_i} = 0. \qquad (2)$$

But since $x'_i \ge x_i$ and $z'_k < z_k$ (by assumption) we conclude:

$$z'_k - x'_i < z_k - x_i. \qquad (3)$$

Now, since $z_k$ lies between $x_k$ and $x_{k+1}$ and $z'_k$ lies between $x'_k$ and $x'_{k+1}$, both
sides of inequality (3) will be of the same sign. Thus

$$\frac{1}{z'_k - x'_i} > \frac{1}{z_k - x_i}. \qquad (4)$$

Since this is true for all i, sums (1) and (2) cannot both equal zero. Q.E.D.
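A small numerical experiment shows the theorem at work for a cubic, where the critical points come from the quadratic formula. This Python sketch is an illustration under assumed data: the two root triples are hypothetical examples, and the helper name is not from the paper.

```python
import math

def critical_points_of_cubic(r1: float, r2: float, r3: float):
    """Roots of p'(x) for p(x) = (x - r1)(x - r2)(x - r3), in increasing order."""
    s1 = r1 + r2 + r3                  # p(x)  = x^3 - s1 x^2 + s2 x - r1 r2 r3
    s2 = r1 * r2 + r1 * r3 + r2 * r3   # p'(x) = 3x^2 - 2 s1 x + s2
    root = math.sqrt(4 * s1 * s1 - 12 * s2)
    return ((2 * s1 - root) / 6, (2 * s1 + root) / 6)

p_roots = (0.0, 1.0, 3.0)   # assumed example roots of p
q_roots = (0.5, 1.0, 4.0)   # same roots, two of them dragged to the right

z = critical_points_of_cubic(*p_roots)
z_dragged = critical_points_of_cubic(*q_roots)
print(z, z_dragged)

# Equation (1): the sum of 1/(z_k - x_i) vanishes at each critical point.
print([sum(1 / (zk - x) for x in p_roots) for zk in z])
```

Both critical points move to the right, and the sums in the last line vanish up to rounding.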


Corollary 2.2. Let p and q be the same polynomials described in the theorem above.
The roots of any derivative $q^{(j)}$ will each be to the right of the corresponding roots of
$p^{(j)}$. I.e. if we shift roots of p to the right, the roots of all its derivatives will also shift
to the right.


Proof: Follows easily by induction on j. Q.E.D. 


Remarks 2.3. (i) With a little care the requirement that the roots be distinct (i.e. 
no multiple roots) may be dropped. (ii) Essentially Corollary 2.2 says that the roots 
of the derivatives of a polynomial “follow” the roots of the polynomial (assuming 
all the roots are real). A more refined analysis which will not be presented here 
gives the following result: The roots of the derivatives will all move faster than the 
slowest moving root of the polynomial and slower than the fastest moving root of 
the polynomial. 


3. APPLICATION OF THE ROOT DRAGGING THEOREM. Let p(x) be a
fourth degree polynomial whose (four) roots are all real and distinct. Call the inner
two roots $a_1$ and $a_2$. Now by Rolle’s theorem, $p'(x)$ must have exactly three real
distinct roots. Call the middle root b. Iterating Rolle’s theorem, $p''(x)$ must have two real
distinct roots (which we call $c_1$ and $c_2$), and $p^{(3)}$ must have one real root, d. By
elementary analysis, d will be the average value of the four roots of p(x).
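The last claim follows from the two leading coefficients of p. A short derivation, assuming p is monic (rescaling does not move any root) and writing $r_1, \ldots, r_4$ for its four roots, names introduced here only for this computation:

```latex
\begin{align*}
p(x) &= (x - r_1)(x - r_2)(x - r_3)(x - r_4)
      = x^4 - (r_1 + r_2 + r_3 + r_4)\,x^3 + \cdots, \\
p^{(3)}(x) &= 24x - 6\,(r_1 + r_2 + r_3 + r_4), \\
p^{(3)}(d) = 0 \quad &\Longrightarrow \quad d = \frac{r_1 + r_2 + r_3 + r_4}{4}.
\end{align*}
```

The terms of degree two and lower in p vanish after three differentiations, so they play no role.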


Theorem 3.1 (Unconstructible fourth degree polynomial). If $a_1 < c_1$ and $a_2 < c_2$
then $b < d$.
Remarks 3.2. Figure 1 illustrates the arrangement of roots which Theorem 3.1
states is unconstructible. (Here “0” represents the location along the real number
line of a root of p, “1” represents a root of $p'$, and so on.)


Figure 1. Theorem 3.1 states that this arrangement of roots is unconstructible, where “0” represents
the location of a root of p(x), “1” represents the location of a root of $p'(x)$ and so on.


Proof of Theorem 3.1: Begin shifting the right-most “0” to the right. Since the
location of the “3” is the average of the “0’s” and since the middle “1” must lie
between the second and third “0’s” by Rolle’s Theorem, the “3” must eventually
line up with the middle “1” as we continue shifting the right-most “0” to the right.
Meanwhile, by Corollary 2.2, the 2’s must shift to the right. Thus if the polynomial
represented by Figure 1 is constructible, then so must Figure 2.

This means the polynomial must be symmetric around the middle “1”, since we
have a fourth degree polynomial with the first and third derivatives equal to zero
there. But clearly the ordering depicted in Figure 2 is not symmetric. Q.E.D.


Figure 2. By shifting the right-most “0” to the right, we will eventually reach this arrangement of roots.


4. GENERALIZATIONS. One might ask whether there is an obvious complex
generalization to Corollary 2.2. If we have a polynomial with complex roots, and
we move all the roots in one direction, will the roots of the derivative all follow?
The answer, as expected, is no, as illustrated by the following counterexample:
Take the third degree polynomial p which has complex roots $i$, $-i$, and a real root
of value 2. One can check that the roots of $p'$ are $\frac{1}{3}$ and 1. But if we shift to the
right the real root of p from 2 to say 3, the roots of the derivative become $1 - \sqrt{2/3}$
and $1 + \sqrt{2/3}$. Since

$$1 - \sqrt{\frac{2}{3}} < \frac{1}{3} < 1 < 1 + \sqrt{\frac{2}{3}},$$

the two roots of the derivative did not both shift to the right.
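The counterexample is quick to verify. With real root r the polynomial is $(x^2 + 1)(x - r) = x^3 - r x^2 + x - r$, so its derivative is $3x^2 - 2rx + 1$. A minimal Python sketch (the function name is illustrative):

```python
import math

def derivative_roots(r: float):
    """Roots of p'(x) = 3x^2 - 2 r x + 1, where p(x) = (x^2 + 1)(x - r)."""
    root = math.sqrt(4 * r * r - 12)      # real whenever r^2 >= 3
    return ((2 * r - root) / 6, (2 * r + root) / 6)

before = derivative_roots(2.0)
after = derivative_roots(3.0)
print(before)   # the roots 1/3 and 1
print(after)    # the roots 1 - sqrt(2/3) and 1 + sqrt(2/3)
```

The smaller critical point moves left while the larger one moves right.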
COMPUTER SCIENCE SAMPLER
Edited by: Catherine C. McGeoch 


Parallel Addition 


Catherine C. McGeoch 


If you set nine women to digging a ditch they will complete it in one-ninth the time 
required by a single woman. But nine women working together cannot bear a child 
in one month. The moral: some tasks can be parallelized and some cannot. 

Can addition be parallelized? If one person can add two n-digit integers in n
seconds, can n people add them in one second? It appears that n-digit addition
requires n seconds no matter how many people are working on it, since the
high-order digits cannot be added until the high-order carry-in is known. But in
fact there does exist a method for adding integers in about $2\log_2 n$ steps (using n
people). The method is called carry-lookahead addition and is incorporated into
the circuitry of nearly all modern computers. In this column we will look at
carry-lookahead addition as well as an interesting parallel method for adding three
integers.

We shall work with nonnegative n-digit integers expressed in binary (base two).
Let X and Y be two such integers and let Z be their $(n + 1)$-digit sum. The digits
of X are denoted $x_{n-1}x_{n-2}\ldots x_1x_0$, and the digits of Y and Z are denoted
similarly. Let $C = c_n c_{n-1}\ldots c_1 c_0$ represent the carry digits: that is, $c_i$ is the
carry-in added to $x_i$ and $y_i$, and equivalently, the carry-out generated by adding
$x_{i-1}$, $y_{i-1}$ and $c_{i-1}$. We include $c_0$ for notational convenience, recognizing that
$c_0 = 0$ always.

Figure 1-a contains a table defining one-digit binary addition with carry-ins and
carry-outs. On the sixth line of the table, for example, we see that 1 + 0 + 1 = 10


(a)                                   (b)

c_i  x_i  y_i   z_i  c_{i+1}
 0    0    0     0     0
 0    0    1     1     0
 0    1    0     1     0
 0    1    1     0     1              T   p p g k p g
 1    0    0     1     0              C   1 1 1 0 1 1 0
 1    0    1     0     1              X     1 0 1 0 0 1
 1    1    0     0     1              Y   + 0 1 1 0 1 1
 1    1    1     1     1              Z   1 0 0 0 1 0 0


Figure 1
in base two. Figure 1-b gives an example 6-digit sum, showing C, X, Y, Z and a row
labeled T (described below). The usual way to add is to apply the one-digit 
function to the digits of X, Y, and C in turn as i goes from 0 to n — 1. The 
amount of time this takes (assuming constant time for each one-digit addition) is 
proportional to n. To achieve faster parallel addition we have to try something 
else. 
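For reference, the usual digit-by-digit method can be sketched as follows. This is a minimal Python illustration, assuming the digits are stored least significant first; the function name `ripple_add` is a hypothetical choice.

```python
def ripple_add(x_bits, y_bits):
    """Add two n-digit binary numbers, digits given least significant first."""
    z_bits, carry = [], 0                # carry starts as c_0 = 0
    for x, y in zip(x_bits, y_bits):     # i runs from 0 to n - 1, strictly in turn
        total = x + y + carry
        z_bits.append(total % 2)         # the sum digit z_i
        carry = total // 2               # the carry-out, c_{i+1}
    z_bits.append(carry)                 # the final carry is the top digit z_n
    return z_bits

# The example of Figure 1-b: X = 101001 and Y = 011011, written low digit first.
print(ripple_add([1, 0, 0, 1, 0, 1], [1, 1, 0, 1, 1, 0]))
```

Each pass through the loop needs the carry from the previous pass, which is exactly the dependency that forces time proportional to n.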


Carry Lookahead Addition. Notice that we can sometimes calculate $c_i$ without
waiting to know the value of $c_{i-1}$. If $x_{i-1}$ and $y_{i-1}$ are both 0, then $c_i$ must be 0,
no matter what value $c_{i-1}$ takes. Similarly, if $x_{i-1}$ and $y_{i-1}$ are both 1, then $c_i$
must also be 1. The only problem arises when exactly one of $x_{i-1}$ and $y_{i-1}$ is 1, in
which case $c_i$ can’t be determined until $c_{i-1}$ is known.

We construct a carry status function $f_i$ to reflect this situation. The carry status
is expressed in terms of three functions k, g and p, each with domain {0, 1}. They
are called the kill function, defined by $k(c) = 0$; the generate function $g(c) = 1$;
and the propagate function $p(c) = c$. (You may recognize them by other names.)
The carry status function is defined by

$$f_i(\cdot) = \begin{cases} k(\cdot) & \text{if } x_{i-1} = y_{i-1} = 0 \\ g(\cdot) & \text{if } x_{i-1} = y_{i-1} = 1 \\ p(\cdot) & \text{if } x_{i-1} \ne y_{i-1} \end{cases}$$

Row T in Figure 1-b shows the carry status functions for the example sum. It is
easy to verify that $c_i = f_i(c_{i-1})$. Furthermore, we can apply function composition


to obtain $c_i = f_i \circ f_{i-1}(c_{i-2})$. The handy table below shows the nine possible results
of composing pairs of functions from {k, g, p}.


In general, $c_i = f_i \circ f_{i-1} \circ \cdots \circ f_j(c_{j-1})$ for $i \ge j > 0$. In particular, we have
$c_i = f_i \circ \cdots \circ f_1(0)$, since by definition $c_0 = 0$. We will adopt the shorthand notation
$[i, j]$ to refer to a sequence of compositions $f_i \circ \cdots \circ f_j$. In this notation
$f_i = [i, i]$ and $c_i = [i, 1](0)$ for i between 1 and n. We stretch the notation slightly
to let $c_0 = [0, 1](0) = 0$.
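The status functions and their compositions are small enough to model directly. The Python sketch below is an illustration rather than the column's own notation: the functions k, g, p are represented by the one-letter strings "k", "g", "p", and the helper names are hypothetical. It computes each $[i, 1]$ by composing one more status function onto $[i-1, 1]$ and then evaluates it at 0.

```python
K, G, P = "k", "g", "p"

def status(x: int, y: int) -> str:
    """Carry status determined by one digit pair: kill, generate or propagate."""
    if x == y:
        return G if x == 1 else K
    return P

def compose(f: str, h: str) -> str:
    """f∘h: kill and generate ignore their input, propagate hands h through."""
    return h if f == P else f

def evaluate(f: str, c: int) -> int:
    """Apply a status function to a carry bit c."""
    return c if f == P else (1 if f == G else 0)

def carries_by_composition(x_bits, y_bits):
    """Return [c_0, ..., c_n], using c_i = [i, 1](0); digits are low-order first."""
    carries, prefix = [0], None
    for x, y in zip(x_bits, y_bits):
        f = status(x, y)                                   # this is f_i = [i, i]
        prefix = f if prefix is None else compose(f, prefix)   # [i, 1] = f_i ∘ [i-1, 1]
        carries.append(evaluate(prefix, 0))
    return carries

# All nine compositions of pairs from {k, g, p}.
for f in (K, G, P):
    print(f, [compose(f, h) for h in (K, G, P)])

# The example of Figure 1-b, digits written low-order first.
print(carries_by_composition([1, 0, 0, 1, 0, 1], [1, 1, 0, 1, 1, 0]))
```

This version still walks through the digits one at a time; its only purpose is to make the notation concrete before the compositions are arranged in a tree.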

Carry-lookahead addition uses a clever two-pass scheme to find all the carries $c_i$
quickly. In the first pass several compositions $[i, j]$ are calculated. In the second
pass, functions of the form $[i, j](c_{j-1})$ are evaluated, one for each i between 1 and
n. Once all the carry values $c_i = [i, j](c_{j-1})$ are known, the individual sums
$x_i + y_i + c_i$ can be found simultaneously to produce the digits of Z.

Figure 2 shows a combinational circuit for performing carry-lookahead on 8-bit 
integers. The circuit comprises several nodes connected together by directed wires. 
The wires carry values: a node sends a value on its output wire according to some 
fixed function of values on its input wires. We require that each node have a fixed 
number of input wires and output wires, and that each node execute its function(s) 
in a fixed amount of time. 

The circuit contains 7 oval nodes arranged in a binary tree, 1 circle node
attached to the root of the tree, and 8 square nodes at the leaves of the tree. These
different types of nodes perform different functions. In general, $n - 1$ ovals, 1
circle, n squares, and n more adder nodes (not shown here) are required for n-bit
addition when n is a power of 2.

[Figure 2: the carry-lookahead circuit for 8-bit integers. The circle node sits above the root oval [8, 1], whose children are the ovals [8, 5] and [4, 1]; below them are the ovals [8, 7], [6, 5], [4, 3], [2, 1] and the square nodes [8, 8], ..., [1, 1]. Upward arrows are labelled with the compositions [i, j], downward arrows with the carries $c_0, \ldots, c_8$.]

Figure 2
We can now add X and Y as follows. 


Step 1. The n square nodes calculate $[i, i] = f_i(\cdot)$, for i in 1...n, each using
inputs $x_{i-1}$ and $y_{i-1}$ (not shown in Figure 2). Each square node sends the
appropriate value k, g, or p along its output wire going up. This step requires $\Theta(1)$
time¹ since the square nodes can operate simultaneously.


Step 2. Each oval node performs the composition $[i, j] = [i, k] \circ [k - 1, j]$ going
up. That is, the function values for $[i, k]$ and $[k - 1, j]$ (each is either k, g, or p)
are obtained from neighbor nodes below and the result $[i, j]$ is passed to the
neighbor above. The circle node at top eventually receives $[n, 1]$. The arrows
pointing up in Figure 2 are labelled to show the flow of values in this step. Overall,
the time required for values to move from the square nodes (where the initial $[i, i]$
values are located) to the circle node (the last one to receive a value) is $\Theta(\log n)$.


Step 3. The circle node evaluates $[n, 1](0)$, equivalent to $c_n$. It also passes $c_0 = 0$
down to the root oval. Each oval node evaluates

$$c_{k-1} = [k - 1, j](c_{j-1})$$

going down. To accomplish this the node retains $[k - 1, j]$ from Step 2 and
receives $c_{j-1}$ from the neighbor above. The result $c_{k-1}$ is passed down to the left
neighbor, and $c_{j-1}$ is passed down to the right. The arrows pointing down in
Figure 2 are labeled to show the flow of values in this step. The total time required
for values to propagate from the circle node down to the square nodes is $\Theta(\log n)$.


¹The notation $\Theta(f(n))$ means “proportional to f(n)” in the following sense: $g(n) = \Theta(f(n))$ means
that there exist positive constants a and b and $n_0$ such that for all $n > n_0$ we have $a f(n) \le g(n) \le b f(n)$.
For example any constant function is $\Theta(1)$ and any function of the form $d \log_2 n + e$ (for constants d
and e) is $\Theta(\log n)$.
Step 4. The circle node has computed $c_n$, and the square nodes now hold the carry
values $c_i$ for i between 0 and $n - 1$. These values can be passed to an array of
adder nodes that simultaneously perform the 3-bit additions $x_i + y_i + c_i$ (discarding
the carry-outs) to obtain the digits $z_i$ of Z. This final step takes $\Theta(1)$ time.


The total amount of time required for the lookahead circuit and the adder array
to form the sum of X and Y is $T(n) = L(n) + A(n)$, where $L(n) = \Theta(\log n)$ (to
find the carry digits by lookahead) and $A(n) = \Theta(1)$ (to add the one-bit triples).
Therefore $T(n) = \Theta(\log n)$. Note that although the circuit contains 3n nodes,
carry-lookahead addition could be performed by n people acting as nodes, since
people can move around and change functions.
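The four steps can be simulated in ordinary sequential code to see that the values flowing up and down the tree really produce the right carries. The following Python sketch is such a simulation under stated assumptions: it runs on one processor, so it shows the data flow and not the logarithmic running time; the status functions are represented by the strings "k", "g", "p"; the split point k is chosen as the midpoint, which matches Figure 2 when n is a power of 2; and all function names are hypothetical.

```python
K, G, P = "k", "g", "p"

def status(x, y):
    """Carry status determined by one digit pair: kill, generate or propagate."""
    if x == y:
        return G if x == 1 else K
    return P

def compose(f, h):
    """f∘h: kill and generate ignore their input, propagate hands h through."""
    return h if f == P else f

def evaluate(f, c):
    """Apply a status function to a carry bit c."""
    return c if f == P else (1 if f == G else 0)

def lookahead_carries(x_bits, y_bits):
    """Return [c_0, ..., c_n]; x_bits[i] and y_bits[i] hold digit i (low digit first)."""
    n = len(x_bits)
    table = {}                       # (i, j) -> the composition [i, j]

    def up(i, j):                    # Steps 1 and 2: build [i, j] from the leaves
        if i == j:
            table[i, j] = status(x_bits[i - 1], y_bits[i - 1])
        else:
            k = (i + j + 1) // 2     # split [i, j] = [i, k] ∘ [k-1, j]
            table[i, j] = compose(up(i, k), up(k - 1, j))
        return table[i, j]

    carries = [0] * (n + 1)

    def down(i, j, c_in):            # Step 3: c_in plays the role of c_{j-1}
        if i == j:
            carries[i - 1] = c_in    # square node [i, i] ends up holding c_{i-1}
            return
        k = (i + j + 1) // 2
        down(i, k, evaluate(table[k - 1, j], c_in))   # left child receives c_{k-1}
        down(k - 1, j, c_in)                          # right child receives c_{j-1}

    carries[n] = evaluate(up(n, 1), 0)   # circle node: c_n = [n, 1](0)
    down(n, 1, 0)
    return carries

def add(x, y, n):
    """Add two n-digit nonnegative integers using the lookahead carries."""
    xb = [(x >> i) & 1 for i in range(n)]
    yb = [(y >> i) & 1 for i in range(n)]
    c = lookahead_carries(xb, yb)
    z = [(xb[i] + yb[i] + c[i]) % 2 for i in range(n)] + [c[n]]   # Step 4
    return sum(bit << i for i, bit in enumerate(z))

print(add(0b101001, 0b011011, 6))    # the example sum of Figure 1-b
```

In a real circuit all calls at the same depth of `up` (and likewise of `down`) would happen at once, which is where the $\Theta(\log n)$ bound comes from.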


Carry Save Addition. Now, how long does it take to add three n-digit integers W,
X, and Y? We could certainly add Y to the $(n + 1)$-digit sum of W and X; this
would require $2T(n + 1)$ time if we use a lookahead-adder circuit of size $n + 1$
twice. A better idea is to apply carry save addition, which only requires $\Theta(1) +
T(n + 1)$ time.

Given n-digit W, X, and Y, a carry save adder constructs two intermediate
integers, an $(n + 1)$-digit U and an n-digit V, such that $U + V = W + X + Y$. Then
U and V are summed with a carry-lookahead adder circuit of size $n + 1$.

Here’s how it works. Referring again to Table 1-a, let the binary integer uv
denote the 2-digit sum of three 1-digit binary integers; that is, $w + x + y = uv$.
Then it must be the case that $w + x + y = 2u + v$.

For each i in $0 \ldots n - 1$, apply the function in Table 1-a to the digits $w_i$, $x_i$, and
$y_i$ of W, X, and Y. Set $v_i = z$ and $u'_i = c_{i+1}$, as labelled in the table, and let $v_i$ and
$u'_i$ denote the digits of V and $U'$, respectively. Let U be defined by $u_0 = 0$ and
$u_i = u'_{i-1}$ for i in $1 \ldots n$. Then V and $U = 2U'$ are the desired intermediate
integers, since


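$$\begin{aligned}
W + X + Y &= \sum_{i=0}^{n-1} (w_i + x_i + y_i)\,2^i \\
&= \sum_{i=0}^{n-1} (2u'_i + v_i)\,2^i \\
&= \sum_{i=0}^{n-1} \left(u'_i\,2^{i+1} + v_i\,2^i\right) \\
&= \sum_{i=0}^{n-1} u_{i+1}\,2^{i+1} + \sum_{i=0}^{n-1} v_i\,2^i \\
&= U + V.
\end{aligned}$$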


The digits $v_i$ and $u_i$ can be calculated simultaneously by n adder nodes in $\Theta(1)$
time. After that, U and V can be added by a carry-lookahead circuit of size $n + 1$
in $T(n + 1)$ time.
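The construction of U and V is short enough to write out. A minimal Python sketch, assuming the three inputs are ordinary nonnegative integers below $2^n$ and using bit operations in place of the adder nodes (the function name `carry_save` is illustrative):

```python
def carry_save(w: int, x: int, y: int, n: int):
    """Return (U, V) with U + V == w + x + y, for n-digit nonnegative w, x, y."""
    u_prime, v = 0, 0
    for i in range(n):                        # each digit position is independent
        s = ((w >> i) & 1) + ((x >> i) & 1) + ((y >> i) & 1)   # equals 2*u'_i + v_i
        v |= (s & 1) << i                     # v_i: the sum digit
        u_prime |= (s >> 1) << i              # u'_i: the carry-out digit
    return 2 * u_prime, v                     # U = 2U', i.e. U' shifted left one place

U, V = carry_save(0b101, 0b111, 0b011, 3)
print(U, V, U + V)
```

No iteration of the loop depends on any other, which is why n adder nodes can do all of this work at once.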


Further Reading. Carry-lookahead addition and carry save addition have been
around since the middle 1960’s. We have since figured out how to parallelize
several other arithmetic operations. For example, carry-save addition can be
generalized so that a circuit containing $\Theta(mn)$ nodes can be used to add m n-digit
numbers in $\Theta(\log_2 m + \log_2 n)$ time steps. This implies that two n-digit numbers
can be multiplied in $O(\log_2 n)$ time steps. We also know that under the standard
formal model of parallel computation it is not possible to add two n-digit numbers
in $\Theta(1)$ time.


Two recent texts by Cormen et al. [1] and by Leighton [2] give excellent 
discussions of parallel arithmetic. The search for efficient parallel algorithms for 
general computational problems is a vigorous research area of theoretical com- 
puter science; Leighton’s text, in particular, gives a comprehensive view of the 


state of the art. 


ACKNOWLEDGMENT. This seems a good time to thank Dan Velleman for outstanding service as a 
“typical mathematical audience”. Dan’s insightful suggestions and comments on draft columns are most 
Word has reached this country that the
Editor of the Zentralblatt fiir Mathematik 
und ihre Grenzgebiete, Professor Otto 
Neugebauer, now of Copenhagen, has re- 
signed. The resignation from this mathe- 
matical abstracts journal was occasioned 
by the action of the publisher, Julius 
Springer of Berlin, in dropping Professor 
Levi-Civita of Italy from the board without 
the knowledge of the Editor, as well as by 
the demand that the Editor give assurance 
that no emigrants would be allowed to
referee articles by German authors. In 
consequence of this interference with edi- 
torial policies, the American associate edi- 
tors, Professors Tamarkin and Veblen, 
have tendered their resignations as have 
also a number of associate editors and 
collaborators in other countries. 


—American Mathematical Monthly 
46 (1939), p. 57 
PROBLEMS AND SOLUTIONS


Edited by: 
Richard T. Bumby, Fred Kochman and Douglas B. West 


Proposed problems should be sent to the MONTHLY PROBLEMS address given on 
the inside front cover. Please include solutions, relevant references, etc. Three copies 
are requested. 


Solutions of published problems should arrive before April 30, 1994 at the 
MONTHLY PROBLEMS address given on the inside front cover. Solutions should be 
typed with double spacing, including the problem number and the solver’s name and 
mailing address. Two copies suffice. A self-addressed postcard or label should be 
included if an acknowledgment is desired. 


An asterisk (* ) after the number of a problem, or part of a problem, indicates that 
no solution is currently available. Partial solutions will be useful in such cases. 
Otherwise, the published solution is likely to be based on a solution which is complete 
and correct. Of course, an elegant partial solution or a method leading to a more 
general result is always useful and welcome. In addition, references to other 
appearances of MONTHLY problems or to solutions of these problems in the 
literature are also solicited. 


PROBLEMS 


10338. Proposed by Charles Vanden Eynden, Illinois State University, Normal, IL. 


Given an integer n > 1, determine the set of integers which can be written as a 
sum of two integers relatively prime to n. 


10339. Proposed by Moshe Rosenfeld, Pacific Lutheran University, Tacoma, WA. 


Let A and B be complex matrices with AB* — B*A = B. Prove that B is 
nilpotent. 


10340. Proposed by Richard Bagby, New Mexico State University, Las Cruces, NM. 
For a normed linear space X and $x \in X$, define

$$P(x) = \{y \in X : \|x + y\|^2 = \|x\|^2 + \|y\|^2\}.$$

If the norm in X comes from an inner product, then each P(x) is invariant under
multiplication by real numbers. Is the converse true?
10341. Proposed by George Cain and Zhiging Lu (student), Georgia Institute of
Technology, Atlanta, GA.


Let $D = \{(x, y) : x^2 + y^2 \le 1\}$ be the unit disk in the plane, and let
$\{A_1, A_2, \ldots, A_n\}$ be a pairwise disjoint collection of finite subsets of the set
$C = \{(x, y) : x^2 + y^2 = 1\}$. Prove that there is a pairwise disjoint collection
$\{K_1, K_2, \ldots, K_n\}$ of connected subsets of D such that $A_i \subset K_i$ for each
$i = 1, 2, \ldots, n$.


10342. Proposed by Shmuel Rosset, Tel Aviv University, Ramat Aviv, Israel. 


Let F be a free group, and R a normal subgroup of F. Consider the subgroups
$[R, nF]$ defined by

$$[R, nF] = \begin{cases} R & \text{if } n = 0, \\ [[R, (n - 1)F], F] & \text{if } n > 0. \end{cases}$$

Prove that the set of elements of finite order in $R/[R, nF]$ is an abelian group.


10343. Proposed by David M. Bloom, Brooklyn College, CUNY, Brooklyn, NY. 


Let us call a subset of $\mathbb{Z}$ semi-unfriendly (abbreviated S-U) if it contains no
three consecutive integers. Let $E_n$ denote the n element set $\{1, 2, \ldots, n\}$, and let

$$A(n, k) = \#\{S \subset E_n : \#S = k,\ S \text{ is S-U}\}$$

$$B(n, k) = \#\{S \subset E_n : \#S = k,\ S \text{ is S-U and } E_n - S \text{ is S-U}\}.$$

Prove that

$$B(3n - 1, n) = A(n + 3, 3)$$

for all $n \ge 1$.
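Small cases of the identity in 10343 can be checked by exhaustive enumeration before attempting a proof. The Python sketch below is only such a check, with hypothetical function names; it says nothing about why the identity holds.

```python
from itertools import combinations

def is_su(subset) -> bool:
    """True if the set contains no three consecutive integers."""
    s = set(subset)
    return not any(a + 1 in s and a + 2 in s for a in s)

def A(n: int, k: int) -> int:
    """Number of k-element S-U subsets of {1, ..., n}."""
    return sum(1 for S in combinations(range(1, n + 1), k) if is_su(S))

def B(n: int, k: int) -> int:
    """Number of k-element S-U subsets of {1, ..., n} whose complement is also S-U."""
    E = set(range(1, n + 1))
    return sum(1 for S in combinations(range(1, n + 1), k)
               if is_su(S) and is_su(E - set(S)))

for n in range(1, 6):
    print(n, B(3 * n - 1, n), A(n + 3, 3))
```

The two columns agree for every n tried here.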


10344*. Proposed by E. Ehrhart, Université de Strasbourg, Strasbourg, France. 


Let “ be a regular tetrahedron, and let P € “. Define D,,(P) to be the sum 
of the distances from P to the vertices of .“, and D,(P) to be the sum of the 
distances from P to the edges of ~. Find the maximum and minimum values of 
D,(P)/D,(P). 


10345. Proposed by George Baloglou, SUNY College at Oswego, Oswego, NY, and 
Fred Galvin, University of Kansas, Lawrence, KS. 


Given a subset $X \subset \mathbb{R}$ one obtains a subset $\mathbb{R}^2 \setminus X^2$ of the plane by removing
those points both of whose coordinates are in X. If $X \ne \mathbb{R}$, such a set always
contains horizontal and vertical lines.

(a) Find such a set X, of Lebesgue measure zero, for which $\mathbb{R}^2 \setminus X^2$ contains no
circles.

(b)* Is there such a set X, of Lebesgue measure zero, for which every connected
subset of $\mathbb{R}^2 \setminus X^2$ consisting of more than one point contains a horizontal or
vertical line segment?
NOTES


Notes: (10339) An element, B, of a ring is called nilpotent if there is a positive
integer k for which $B^k = 0$. For the ring of n by n matrices over the complex
numbers, for fixed n, it would be of interest to seek a complete characterization of
the solutions of the equation of this problem. (10342) Here, the symbol [A, B]
stands for the group generated by the commutators $aba^{-1}b^{-1}$ with $a \in A$ and
$b \in B$. If A is a normal subgroup of B, so is [A, B]. A reference for free groups is
Magnus, Karrass, & Solitar, Combinatorial Group Theory. (10344) This problem is 
listed as “unsolved” and no bounds are stated although partial results including 
conjectured extreme values are available, because these results are supported only 
by numerical evidence. 


SOLUTIONS 


Six Barycenters in Search of a Conic 


E3469 [1991, 955]. Proposed by Hiiseyin Demir, Middle East Technical University, 
Ankara, Turkey. 


Suppose P is a point in the interior of triangle ABC and suppose AP, BP, CP 
meet the lines BC, CA, AB respectively at the points D, E, F. Prove that the 
centroids of the six triangles PBD, PDC, PCE, PEA, PAF, PFB lie on a conic if 
and only if P lies on at least one of the three medians of the triangle. 


Restatement of problem and fixing of notation. Applying the homothety with 
center P and ratio 3:2, we see that the centroids of the six triangles lie on a conic 
if and only if the midpoints of AF, FB, BD, DC, CE, and EA lie on a conic. Let 
$x, y, z, u, v, w$ denote half the lengths of AF, FB, BD, DC, CE, EA, respectively. 
Let the midpoints of AF, FB, BD, DC, CE, EA be denoted by 1, 2, 3, 4, 5, 6, 
respectively. 


Solution I by Victor Prasolov, Independent University of Moscow, Moscow, 
Russia. 

By Carnot’s Theorem (see Howard W. Eves, A survey of geometry (Revised 
Edition), Allyn and Bacon, 1972, pages 256 and 262) the six centroids lie on a conic 
if and only if 

$$x(2x + y)\,z(2z + u)\,v(2v + w) = w(2w + v)\,u(2u + z)\,y(2y + x). \tag{1}$$

By Ceva’s Theorem, $xzv = wuy$, so (1) simplifies to 
$xzw + zvy + vux - (wux + uyv + ywz) = 0$, or $(x - y)(z - u)(w - v) = 0$. 
This condition corresponds to P lying on a median. 
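
The simplification in Solution I can be checked by direct expansion. The following worked computation is a sketch of that step; the abbreviation K for the common value xzv = wuy is introduced here for convenience and does not appear in the original solution.

```latex
% Ceva: K := xzv = wuy. Dividing (1) by K leaves
(2x+y)(2z+u)(2v+w) = (2w+v)(2u+z)(2y+x).
% Expanding both sides:
\begin{align*}
\text{LHS} &= 8xzv + 4xzw + 4xuv + 4yzv + 2xuw + 2yzw + 2yuv + yuw,\\
\text{RHS} &= 8wuy + 4wux + 4wzy + 4vuy + 2wzx + 2vux + 2vzy + vzx.
\end{align*}
% Since xzv = wuy = K, the terms 8xzv + yuw and 8wuy + vzx both equal 9K and cancel:
\text{LHS} - \text{RHS} = 2\bigl[xzw + zvy + vux - (wux + uyv + ywz)\bigr].
% On the other hand,
(x-y)(z-u)(w-v) = xzw + zvy + vux - (wux + uyv + ywz) + (yuw - xzv),
% and the last bracket vanishes by Ceva.
```

A vanishing factor such as x = y says that F is the midpoint of AB, so that CF is a median and P lies on it; the other two factors are handled in the same way.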

Solution II by Albert Nijenhuis, Seattle, WA. By Pascal’s Theorem, the points 1, 
2, 3, 4, 5, and 6 lie on a conic if and only if the three points $Q = AB \cap 45$, 
$R = BC \cap 61$, and $S = CA \cap 23$ are collinear. (There is no real difficulty if any of 
these points are at infinity. The ratio $AQ/QB$, for example, is replaced by $-1$ if 
$AB \parallel 45$.) 

By Menelaus’ Theorem, we have 

$$\frac{AQ}{QB} = -\frac{u}{v}\cdot\frac{2w+v}{2z+u}, \qquad \frac{BR}{RC} = -\frac{w}{x}\cdot\frac{2y+x}{2v+w}, \qquad \frac{CS}{SA} = -\frac{y}{z}\cdot\frac{2u+z}{2x+y}.$$

Multiplying these together and using Ceva’s theorem, as in Solution I, we see that 
$AQ/QB \cdot BR/RC \cdot CS/SA = -1$ if and only if $(x - y)(z - u)(w - v) = 0$. Thus 
Q, R, S are collinear, and hence the points 1, 2, 3, 4, 5, 6 lie on a conic, if and only if 
P is on a median. 


Comments by Neela Lakshmanan, University of Scranton, Scranton, PA. The 
restriction that P is interior to the triangle may be relaxed: we need only that P 
does not lie on any side of the triangle. 

We can prove that the result is true not only for the midpoints but also for the 
points that divide each of those six segments in a constant ratio: If 1, 2, 3, 4, 5, 6 are 
points on the sides of the triangle defined by A1:1F = F2:2B = B3:3D = 
D4:4C = C5:5E = E6:6A, then the six points lie on a conic if and only if P is 
on a median. Also, if P is an interior point, the hexagon 1, 2, 3, 4, 5, 6 is convex and 
attains its maximum area when P is the centroid of △ABC. 


Editorial comment. Many of the solvers supplemented the use of Carnot’s 
Theorem or Pascal’s Theorem with homogeneous coordinates and analytic meth- 
ods. Some others worked directly with conditions on the six coefficients of a 
general conic. 


Solved also by F. Bellot and M. A. Lopéz (Spain), R. J. Chapman (U.K.), J. Fukuta (Japan), H. 
Kappus (Switzerland), O. P. Lossers (The Netherlands), I. A. Sakmar (Turkey), Anchorage Math 
Solutions Group, and the proposer. One incorrect solution was received. 


Periodicity of a Sign Function 


E 3471 [1991, 955]. Proposed by William Calbeck, Florida International University, 
Miami, FL, and Bruce Reznick, University of Illinois, Urbana, IL. 


Let $P_k$ be the set of all integer-valued polynomials of degree at most $k$, i.e., the 
set of all polynomials $p$ of degree at most $k$ such that $p(n) \in \mathbb{Z}$ for $n \in \mathbb{Z}$. (It is 
known that $p \in P_k$ if and only if 

$$p(x) = a_0 + a_1\binom{x}{1} + a_2\binom{x}{2} + \cdots + a_k\binom{x}{k},$$

where $a_0, a_1, a_2, \ldots, a_k$ are integers.) Let $r(k)$ be the smallest power of 2 strictly 
greater than $k$. 

(a) If $p \in P_k$, show that the sequence $\{(-1)^{p(n)}\}_n$ is periodic with period 
$r(k)$. 

(b) Show that any given sequence of plus and minus ones with period $2^n$ occurs 
for some $p$ in $P_{2^n-1}$. 

Solution to part (a) by Robin J. Chapman, University of Exeter, UK. It suffices to 
show that if $j < 2^s$ and $m \in \mathbb{Z}$, then 

$$\binom{m}{j} \equiv \binom{m + 2^s}{j} \pmod 2.$$

These are the coefficients of $x^j$ in the power series $(1 + x)^m$ and $(1 + x)^{m+2^s}$, 
respectively. The congruence $(1 + x)^{2^s} \equiv 1 + x^{2^s} \pmod 2$ follows easily by 
induction on $s$. Hence $(1 + x)^{m+2^s} \equiv (1 + x)^m(1 + x^{2^s}) \pmod 2$. Since $j$ is 
less than $2^s$, it is immediate that the coefficients of $x^j$ in $(1 + x)^m$ and 
$(1 + x)^{m+2^s}$ have the same parity. 
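
Part (a) can be checked numerically on random examples. This is a minimal sketch rather than part of the published solution; the helpers `binom`, `r`, and `sign_sequence` are names chosen here, and `binom` uses the generalized binomial coefficient so that negative arguments are covered as well.

```python
import random

def binom(m, j):
    """Generalized binomial coefficient C(m, j) for any integer m and j >= 0."""
    num, den = 1, 1
    for i in range(j):
        num *= m - i
        den *= i + 1
    return num // den  # exact: j! divides any product of j consecutive integers

def r(k):
    """Smallest power of 2 strictly greater than k."""
    p = 1
    while p <= k:
        p *= 2
    return p

def sign_sequence(coeffs, lo, hi):
    """Signs (-1)^{p(n)} for lo <= n < hi, where p(x) = sum_j coeffs[j] * C(x, j)."""
    return [(-1) ** (sum(a * binom(n, j) for j, a in enumerate(coeffs)) % 2)
            for n in range(lo, hi)]

random.seed(0)
for k in range(0, 9):
    coeffs = [random.randint(-5, 5) for _ in range(k + 1)]
    period = r(k)
    signs = sign_sequence(coeffs, -40, 40 + period)
    # Shifting the window by r(k) must reproduce the same signs.
    assert signs[period:] == signs[:-period]
```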


Solution to part (b) by Albert Nijenhuis, Seattle, WA. Let $a_0, \ldots, a_{2^n-1}$ be 
arbitrary integers, and let $b_0, \ldots, b_{2^n-1}$ be the solution to the equations 

$$\sum_{i=0}^{2^n-1} b_i\binom{j}{i} = a_j \quad \text{for } 0 \le j \le 2^n - 1.$$

The matrix of this system is lower triangular, with 1’s on the main diagonal, so $\{b_i\}$ 
are integers. The polynomial $\sum_{i=0}^{2^n-1} b_i\binom{x}{i}$ realizes the sequence 
$\{(-1)^{a_j}\}_{j=0}^{2^n-1}$ and its extension with period $2^n$. 
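
The construction in part (b) amounts to forward substitution in a unit lower triangular system. The sketch below illustrates it for one arbitrarily chosen sign pattern of length 8; the function names and the sample pattern are assumptions made for the example, and `math.comb` requires Python 3.8 or later.

```python
from math import comb

def binomial_basis_coeffs(a):
    """Solve sum_i b_i * C(j, i) = a_j (0 <= j < len(a)) by forward substitution."""
    b = []
    for j, a_j in enumerate(a):
        # C(j, j) = 1, so b_j is a_j minus the contribution of earlier b_i.
        b.append(a_j - sum(b[i] * comb(j, i) for i in range(j)))
    return b

def evaluate(b, x):
    """Evaluate sum_i b_i * C(x, i) at a non-negative integer x."""
    return sum(b_i * comb(x, i) for i, b_i in enumerate(b))

target = [+1, -1, -1, +1, -1, -1, -1, +1]   # one period, length 2^3
a = [0 if s == 1 else 1 for s in target]     # exponents with (-1)^{a_j} = target[j]
b = binomial_basis_coeffs(a)
signs = [(-1) ** (evaluate(b, x) % 2) for x in range(4 * len(target))]
assert signs == target * 4                   # periodic continuation, by part (a)
```

Only non-negative arguments are evaluated here; the periodic extension to negative integers follows from part (a) but is not exercised by this sketch.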


Editorial comment. Solvers used various methods; several cited theorems of 
Kummer and Lucas. William F. Trench recognized the problem as E 1365 [1959, 
312; 1959, 919], proposed by M. E. Hausner and solved by N. J. Fine, and noted 
that a generalization to modulus m appears in William F. Trench, “On periodicities 
of certain sequences of residues,” this MONTHLY 67 (1960), 652-656. Problem 
E 1365 had only two solvers in 1959. 


Solved also by D. Callan, M. Dindos (Slovakia), F. J. Flanigan, R. High, K. S. Kedlaya (student), O. 
P. Lossers (The Netherlands), R. Martin (student), M. D. Meyerson, A. Pedersen (Denmark), W. F. 
Trench, Anchorage Math Solutions Group, National Security Agency Problems Group, and the 
proposer. 


Arbitrarily Periodic Sequences 


10184 [1992, 60]. Proposed by Gerry Myerson, Macquarie University, New South 
Wales, Australia. 


Is there a sequence of natural numbers having the following two properties: 
(i) The sequence is periodic modulo m for every positive integer m, 
(ii) each natural number appears in the sequence infinitely often? 


Solution I by Kiran S. Kedlaya (student), Harvard University, Cambridge, MA. 
Yes. The following algorithm constructs such a sequence. Let $S_1$ be the sequence 
whose one term is 1. For $n \geq 1$, recursively define $S_{n+1}$ by appending to the end 
of $S_n$ the sequences $S_n + 0\cdot n!,\ S_n + 1\cdot n!,\ \ldots,\ S_n + n\cdot n!$, where $T + k$ denotes 
the finite sequence obtained by adding $k$ to each term of $T$. Let $S$ be the 
sequence generated by this procedure as $n \to \infty$. 

To prove that $S$ satisfies (i), note that for all $n > m$, we obtain $S_n$ by appending 
to $S_m$ blocks that are congruent to $S_m$ modulo $m$. Hence $S$ is periodic modulo $m$ with 
period dividing the length of $S_m$, which is seen by induction to be $(m + 1)!/2$. 

To verify that $S$ satisfies (ii), first note by induction that $S_m$ contains $\{1, \ldots, m!\}$. 
Then observe that every number in $S_m$ occurs at least twice as often in $S_{m+1}$. Thus 
every natural number appears in $S$ infinitely often. 
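
The recursive construction is easy to simulate. The sketch below builds an initial segment of S and checks the three facts used in the argument (length, periodicity modulo m, and coverage of 1, ..., m!). It assumes the reading of the recursion in which S_n is followed by the n + 1 blocks S_n + k·n! for k = 0, ..., n, which is the reading consistent with the stated length (m + 1)!/2; the function name `build` is an illustrative choice.

```python
from math import factorial

def build(n_max):
    """Return S_{n_max}: S_1 = [1]; S_{n+1} is S_n followed by S_n + k*n! for k = 0..n."""
    s = [1]
    for n in range(1, n_max):
        s = s + [t + k * factorial(n) for k in range(n + 1) for t in s]
    return s

S = build(6)
assert len(S) == factorial(7) // 2                 # length of S_m is (m+1)!/2
for m in range(1, 6):
    period = factorial(m + 1) // 2                 # length of S_m
    # Periodicity modulo m on the computed segment.
    assert all(S[i] % m == S[i + period] % m for i in range(len(S) - period))
    # S_m contains every integer from 1 to m!.
    assert set(range(1, factorial(m) + 1)) <= set(build(m))
```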

Solution II by Richard Stong, University of California, Los Angeles, CA. For each 
$n > 0$, there are unique integers $c_1, \ldots, c_k$ with $0 \le c_j \le j$ such that $n = \sum_{j=1}^{k} c_j\, j!$. 
Let $g(x) = \max\{0, x - 1\}$ and define $a_n = \sum_{j=1}^{k-1} g(c_{j+1})\, j!$. 

Condition (i) holds for $\{a_n\}$ because $a_{n+(m+1)!} \equiv a_n \pmod{m!}$. To verify condition 
(ii), note that if $n = \sum_{j=1}^{k} c_j\, j!$, then $a_r = n$ for any $r$ of the form $r = \sum_{j=1}^{s} b_j\, j!$ with 
$s > k$ and 

$$b_j = \begin{cases} c_{j-1} + 1 & \text{if } 2 \le j \le k + 1, \\ 0 \text{ or } 1 & \text{otherwise.} \end{cases}$$
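
The explicit formula can be turned into a short program. The sketch below assumes the factorial-base reading of the formula, with a_n equal to the sum of g(c_{j+1})·j! over j ≥ 1; the function names `factorial_digits`, `a`, and `preimage` are illustrative. It checks the congruence behind condition (i) and, for condition (ii), the simplest preimages (those with no extra high digits).

```python
from math import factorial

def factorial_digits(n):
    """Digits c_1, c_2, ... of n in the factorial base: n = sum c_j * j!, 0 <= c_j <= j."""
    digits, j = [], 1
    while n > 0:
        n, c = divmod(n, j + 1)
        digits.append(c)
        j += 1
    return digits

def a(n):
    """a_n = sum over j >= 1 of g(c_{j+1}) * j!, with g(x) = max(0, x - 1)."""
    c = factorial_digits(n)                      # c[j - 1] holds c_j
    return sum(max(0, c[j] - 1) * factorial(j) for j in range(1, len(c)))

def preimage(n, low_digit=0):
    """An index r with a(r) == n: b_1 = low_digit, b_j = c_{j-1} + 1 for 2 <= j <= k + 1."""
    c = factorial_digits(n)
    return low_digit + sum((c[j - 2] + 1) * factorial(j) for j in range(2, len(c) + 2))

for m in range(1, 5):
    assert all((a(n + factorial(m + 1)) - a(n)) % factorial(m) == 0 for n in range(1, 300))
assert all(a(preimage(n, d)) == n for n in range(1, 200) for d in (0, 1))
```

Further preimages arise by appending higher digits equal to 0 or 1, which is why each value occurs infinitely often.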


Editorial comment. Richard Stong’s solution was the only submission giving an 
explicit formula for the nth term of the sequence; most solvers gave recursive 
procedures. Solvers disagreed on whether the natural numbers include 0. For this 
problem the question is moot, as the periodicity is unaffected by adjusting each 
term by 1. The proposer observed that “natural numbers” can be replaced by 
“integers” by alternating the terms of S with the terms of 1 — S. 


Solved also by M. Dasef & S. Kautz, P. Flor (Austria), J. Gonzalez-Meneses (student, Spain), J. W. 
Grossman, T. Hesterberg, R. High, N. Kang (student, Korea), U. Klein (student, Germany), O. P. 
Lossers (The Netherlands), M. D. Meyerson, A. Nijenhuis, I. Praton, A. Riese, R. M. Robinson, T. W. 
Starbird, D. M. Wells, GCHQ Problem Solving Group (U.K.), Theory First, University of South 
Alabama Problem Group, and the proposer. 


A Golden Oldie 
10193 [1992, 161]. Proposed by Solomon Golomb, University of Southern California, 
Los Angeles, CA. 


Determine all pairs of integers $n$, $k$ such that 

$$\binom{n}{k} = \binom{n+1}{k-1}, \qquad n > k > 1.$$

Solution by Christos Athanasiadis (student), Massachusetts Institute of Technology, 
Cambridge, MA. All such pairs are given by $n = F_{2m+1}F_{2m} - 1$, $k = F_{2m}F_{2m-1}$ 
for $m = 2, 3, \ldots$. Here $(F_m)_{m \ge 1}$ is the Fibonacci sequence defined by $F_1 = F_2 = 1$ 
and $F_{m+2} = F_{m+1} + F_m$. 

To see this, first note that the given condition can be written as 

$$(n + 1)k = (n - k + 1)(n - k + 2)$$

or as 

$$(p + k)k = p(p + 1), \tag{1}$$

where $p = n - k + 1$. Let $p = rt$, $k = st$ with $(r, s) = 1$. Then (1) becomes 
$(r + s)st^2 = rt(rt + 1)$. It follows that $t$ divides $r$, so that $r = tr_1$ and $(r + s)s = 
r_1(rt + 1)$. Since $r_1$ is relatively prime to $s$ and hence also to $r + s$, it must be that 
$r_1 = 1$. Hence $p = t^2$, $k = st$, and $t^2 + 1 = s(t + s)$. We need the following 
lemma. 


Lemma. The integer solutions to 

$$t^2 + 1 = s(t + s), \qquad s \ge 1,\ t \ge 1 \tag{2}$$

are given by $s = F_{2m-1}$, $t = F_{2m}$ for $m = 1, 2, \ldots$. 
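
Proof: The classical formula $F_{2m+1}F_{2m-1} - F_{2m}^2 = 1$ (easily proved by induction) 
shows that $s = F_{2m-1}$, $t = F_{2m}$ is a solution of (2) for $m = 1, 2, \ldots$. (In particular, 
$(1, 1)$ is a solution.) We now use an argument by descent to show that there are no 
other solutions of (2). Suppose that $s$ and $t$ are positive integers satisfying (2) and 
that $(s, t) \ne (1, 1)$. Put 

$$\begin{pmatrix} v \\ u \end{pmatrix} = \begin{pmatrix} 1 & -1 \\ -1 & 2 \end{pmatrix}\begin{pmatrix} t \\ s \end{pmatrix}, \quad\text{i.e.,}\quad \begin{pmatrix} t \\ s \end{pmatrix} = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}\begin{pmatrix} v \\ u \end{pmatrix}. \tag{3}$$

Since $t^2 + 1 = st + s^2 > 1 + s^2$, we have $s < t$. Since $st + s^2 > t^2$, we have 
$(s/t) + (s/t)^2 > 1$ and hence $s/t > (\sqrt{5} - 1)/2 > 1/2$. Thus $t/2 < s < t$, which 
implies that $u = 2s - t$ and $v = t - s$ are positive integers. It is easy to verify that 
$v^2 + 1 - u(v + u) = t^2 + 1 - s(t + s)$, so that $(u, v)$ is a solution of (2) with 
$0 < u < s$, $0 < v < t$. 

It follows by repetition of this argument that 

$$\begin{pmatrix} t \\ s \end{pmatrix} = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}^{m-1}\begin{pmatrix} 1 \\ 1 \end{pmatrix}$$

for some positive integer $m$ greater than 1. A simple induction argument shows 
that 

$$\begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}^{m-1} = \begin{pmatrix} F_{2m-1} & F_{2m-2} \\ F_{2m-2} & F_{2m-3} \end{pmatrix} \qquad (m = 2, 3, \ldots).$$

Hence, if $(s, t)$ is any solution of (2) other than $(1, 1)$, we have 

$$\begin{pmatrix} t \\ s \end{pmatrix} = \begin{pmatrix} F_{2m-1} & F_{2m-2} \\ F_{2m-2} & F_{2m-3} \end{pmatrix}\begin{pmatrix} 1 \\ 1 \end{pmatrix} = \begin{pmatrix} F_{2m} \\ F_{2m-1} \end{pmatrix}$$

for some positive integer $m$ greater than 1. Thus the lemma is proved. 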
In view of the lemma we have $k = st = F_{2m}F_{2m-1}$ and 

$$n = p + k - 1 = t^2 + st - 1 = F_{2m}F_{2m+1} - 1$$

for some integer $m$ greater than 1, as claimed. The first five solutions $(n, k)$ are 
(14, 6), (103, 40), (713, 273), (4894, 1870), and (33551, 12816). 
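
The Fibonacci parametrization is straightforward to check by machine. The sketch below generates the pairs for m = 2, ..., 6, verifies the product form (n + 1)k = (n - k + 1)(n - k + 2) and the equivalent binomial identity C(n, k) = C(n + 1, k - 1), and confirms by exhaustive search that there are no other solutions with n below 800. The search bound and the function names are arbitrary choices made for this illustration.

```python
from math import comb

def fib(m):
    """F_1 = F_2 = 1, F_{m+2} = F_{m+1} + F_m."""
    a, b = 1, 1
    for _ in range(m - 1):
        a, b = b, a + b
    return a

def solution(m):
    """The pair (n, k) = (F_{2m} F_{2m+1} - 1, F_{2m} F_{2m-1})."""
    return fib(2 * m) * fib(2 * m + 1) - 1, fib(2 * m) * fib(2 * m - 1)

pairs = [solution(m) for m in range(2, 7)]
print(pairs)
for n, k in pairs:
    assert (n + 1) * k == (n - k + 1) * (n - k + 2)
    assert comb(n, k) == comb(n + 1, k - 1)

# Exhaustive search over a small range finds no other solutions there.
found = [(n, k) for n in range(3, 800) for k in range(2, n)
         if (n + 1) * k == (n - k + 1) * (n - k + 2)]
assert found == pairs[:3]
```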


Editorial comment. David M. Bloom and Savely Khosid each pointed out that 
the same problem appeared in the MONTHLY over sixty years ago as Problem 3459 
[1930, 508; 1931, 551]. The above solution is more concise and direct than the 
solution published in 1931 (which used the theory of simple continued fractions). 
Problem 3459 is also the 65th problem in the collection of MONTHLY problems 
published as [1]. 

The problem is also treated in [3], [4], [5], and [6] (particularly pp. 32-34). These 
previous occurrences were called to our attention by B. M. M. de Weger, by 
Jean-Marie Pages and Dave Trautman, by Robert B. McNeill, and by Mark Sand, 
respectively. 

The diophantine equation $t^2 + 1 = s(t + s)$ of the above lemma may be written 
as $(2s + t)^2 - 5t^2 = 4$, an instance of the so-called Pell equation. (See, for 
example, Chapter 7 of Part One of [2].) Most solvers used the theory of the Pell 
equation or the theory of simple continued fractions. The selected solution 
bypasses the general theory, but uses knowledge of the small solutions of equation 
(2) to construct the change of variables in (3). 

About one-third of the solvers obtained the result in the form $n = F_{2m}F_{2m+1} - 1$, 
$k = F_{2m}F_{2m-1}$, $m > 1$, given in the above solution. Several solvers included lists 
of pairs $(n, k)$ produced by this formula. Some solvers, as well as reference [4], also 
gave the 29 digits of $\binom{103}{40}$. No one attempted to display the next value. 


Solved by 88 readers and the proposer. Six incorrect solutions were also received. 


Similar Orthic Triangles 


10202 [1992, 265]. Proposed by Juan Bosco Romero Marquez, Universidad de 
Valladolid, Valladolid, Spain. 


Let A', B', C' be the feet of the altitudes of △ABC and let X, Y, Z be the 
centers of the circumscribing rectangles of △ABC with edges BC, CA, AB respectively. 
Prove that △XYZ is a dilation of △A'B'C'. 


Solution I by Robin J. Chapman, University of Exeter, Exeter, U. K. There is an 
ambiguity as to what is meant by “circumscribing rectangle.” The circumscribing 
rectangle of △ABC with edge BC may be defined as either: 

(i) the rectangle BCPQ where A lies on the line PQ (possibly extended); or 

(ii) the smallest rectangle containing △ABC, one of whose sides lies on the line 
BC. 

These two definitions coincide provided neither ∠ABC nor ∠ACB is obtuse. 
The result is always true under interpretation (i), but false under interpretation (ii) 
whenever △ABC has an obtuse angle. In particular, if ∠ABC is obtuse, then both 
X and Z coincide with the midpoint of AC, but △A'B'C' is not degenerate. 

We adopt definition (i) and use vector methods. Choose the origin O to be the 
centroid of △ABC. Let a, b, c, a', b', c', x, y, z be the position vectors of 
A, B, C, A', B', C', X, Y, Z respectively. The circumscribing rectangle of △ABC 
with edge BC has vertices B, C and the points with position vectors b + (a - a') 
and c + (a - a'). Hence x = (a + b + c - a')/2 = -a'/2 as O is the centroid of 
△ABC. Similarly y = -b'/2 and z = -c'/2. Hence △XYZ is obtained from 
△A'B'C' by a dilation of factor -1/2 centered at the centroid O of △ABC. 
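
The vector identity in Solution I can be confirmed numerically for a sample triangle. The sketch below uses interpretation (i) of “circumscribing rectangle”; the coordinates are arbitrary and the helper names `foot` and `rectangle_center` are chosen here for illustration.

```python
def foot(p, q, r):
    """Foot of the perpendicular from point p onto the line through q and r."""
    dx, dy = r[0] - q[0], r[1] - q[1]
    t = ((p[0] - q[0]) * dx + (p[1] - q[1]) * dy) / (dx * dx + dy * dy)
    return (q[0] + t * dx, q[1] + t * dy)

def rectangle_center(p, q, r):
    """Center of the rectangle with edge qr whose opposite side passes through p."""
    f = foot(p, q, r)
    # The other two vertices are q + (p - f) and r + (p - f); the center is the
    # midpoint of the diagonal joining q to r + (p - f).
    return ((q[0] + r[0] + p[0] - f[0]) / 2, (q[1] + r[1] + p[1] - f[1]) / 2)

A, B, C = (0.3, 2.1), (-1.7, -0.4), (2.5, -0.9)        # arbitrary sample triangle
G = ((A[0] + B[0] + C[0]) / 3, (A[1] + B[1] + C[1]) / 3)

pairs = [(rectangle_center(A, B, C), foot(A, B, C)),   # X and A'
         (rectangle_center(B, C, A), foot(B, C, A)),   # Y and B'
         (rectangle_center(C, A, B), foot(C, A, B))]   # Z and C'
for center, ft in pairs:
    # The dilation of ratio -1/2 about G should map each foot to the matching center.
    assert abs((center[0] - G[0]) + (ft[0] - G[0]) / 2) < 1e-12
    assert abs((center[1] - G[1]) + (ft[1] - G[1]) / 2) < 1e-12
```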


Solution II by Shailesh Shirali, Rishi Valley School, Chittoor District, Andhra 
Pradesh, India. Let △DEF be the medial triangle of △ABC with vertex D 
opposite vertex A. Then it is easy to see that △XYZ is just the orthic triangle of 
△DEF with vertex X opposite vertex D. Now, a dilation about the centroid G of 
△ABC with scale factor -1/2 sends △ABC to △DEF and therefore sends the 
orthic triangle of △ABC, namely △A'B'C', to that of △DEF, namely △XYZ. 

Editorial comment. Solution II, like other solutions employing constructions of 
classical geometry, was accompanied by a drawing. Jordi Dou submitted such a 
diagram entitled “Proof without words.” His diagram also highlights the fact, also 
observed by other solvers, that the dilation sending △XYZ to △A'B'C' also sends 
the circumcenter of △ABC (which is the orthocenter of the medial triangle) to its 
orthocenter, thereby exhibiting the fact that the centroid divides the segment 
joining the circumcenter and the orthocenter in the ratio of 1:2 (the property of 
the Euler line). 

Jiro Fukuta proved the following more general result. Let A', B', C' be any 
points on the sides BC, CA, AB, respectively. Let X be the center of the 
circumscribing parallelogram with one edge BC and the other pair of edges 
parallel to AA', and similarly for Y and Z. Then △XYZ is a dilation of △A'B'C', 
centered at the centroid of △ABC, in the ratio of -1/2. This can be proved in the 
same way as the original problem, using either synthetic or vector methods. Since 
the lines AA', BB', and CC' are not required to be concurrent, this is more 
general than the affine version of the stated problem. 

This generalization can be easily carried over to higher dimensions in the 
following manner. Let $A_0A_1\cdots A_n$ be a simplex in Euclidean $n$-space. For each 
$i = 0, 1, \ldots, n$, let $A'_i$ be any point in the facet opposite $A_i$, and let $X_i$ be such that 
the vector $X_i - G_i$ is equal to the vector $(A_i - A'_i)/n$, where $G_i$ is the centroid of 
the facet. Then $X_0X_1\cdots X_n$ is a dilation of $A'_0A'_1\cdots A'_n$, centered at the centroid 
of the given simplex, in the ratio $-1/n$. 


Solved also by E. Alkan (student, Turkey), P. J. Anderson (Canada), J. Anglesio (France), F. Bellot 
and M. A. Lopéz (Spain), P.-C. Chuang, A. Coffman, I. Dimitric, J. Dou (Spain), J. Fukuta (Japan), H. 
W. Guggenheimer, J. G. Heuver (Canada), H. Kappus (Switzerland), I. Kastanas, K. S. Kedlaya 
(student), N. Komanda, O. P. Lossers (The Netherlands), M. Lucian, H. M. Marston, R. Merrill, K. 
Perera (student), W. Reyes (Chile), B. Shawyer (Canada), A. Subramanian (student, India), T. C. Tran, 
M. Vowe (Switzerland), R. L. Young, and the University of Wyoming Problem Circle. The original 
proposal presented only a special case of the published problem. 


Matrices with Agreeable Adjoints 


10205 [1992, 266]. Proposed by Richard Sinkhorn, University of Houston, Houston, 
TX. 


In elementary linear algebra, two different definitions of the word “adjoint” are 
used. The adjoint of a square matrix A with complex entries is either: 

(I) the matrix whose (i, j)-entry is the cofactor of $a_{ji}$ in A; or, 

(II) the complex conjugate of the transpose of A. 

Under what conditions on the matrix A will these two definitions yield the same 
matrix? 


Solution by Peter Nylen, Tin-Yau Tam, and Frank Uhlig, Auburn University, 
Auburn, AL. There are three possibilities: (i) A is a zero matrix; (ii) A is unitary 
with det A = 1; or (iii) A is a 2 by 2 matrix of the form 

$$\begin{pmatrix} a & b \\ -\bar{b} & \bar{a} \end{pmatrix}.$$

We use adj A for the first “adjoint” and $A^*$ for the second. The above matrices 
all satisfy $A^* = \operatorname{adj} A$. Notice that $(\operatorname{adj} A)A = A(\operatorname{adj} A) = (\det A)I$. If 
$A^* = \operatorname{adj} A$, then 

$$A^*A = AA^* = (\det A)I. \tag{1}$$

Since $AA^*$ is positive semi-definite, $\det A \ge 0$. By taking the trace of both sides of 
(1), it follows that A is either nonsingular or zero. If A is nonsingular, take the 
determinant of both sides of (1). Then $|\det A|^2 = (\det A)^n$. Hence, if $n \ne 2$, 
$\det A = 1$, and consequently, $A^{-1} = (\det A)^{-1}(\operatorname{adj} A) = A^*$, i.e., A is unitary. 
For n = 2, direct comparison of the entries of $A^*$ and adj A gives the displayed 
form. 
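
The 2 by 2 case is small enough to verify directly. The sketch below compares the two “adjoints” entry by entry for a matrix with first row (a, b) and second row (-conj(b), conj(a)); the entry names a and b, the sample values, and the helper names are assumptions made for this illustration.

```python
def adjugate2(m):
    """Classical adjoint (transposed cofactor matrix) of a 2x2 matrix."""
    (p, q), (r, s) = m
    return [[s, -q], [-r, p]]

def conj_transpose2(m):
    """Complex conjugate of the transpose of a 2x2 matrix."""
    (p, q), (r, s) = m
    return [[p.conjugate(), r.conjugate()], [q.conjugate(), s.conjugate()]]

a, b = 2 - 1j, 0.5 + 3j                       # arbitrary sample entries
A = [[a, b], [-b.conjugate(), a.conjugate()]]
assert adjugate2(A) == conj_transpose2(A)

# A generic 2x2 matrix outside this family fails the comparison.
B = [[1 + 0j, 2 + 0j], [3 + 0j, 4 + 0j]]
assert adjugate2(B) != conj_transpose2(B)
```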

A related question is discussed in E. E. Underwood, “Classification of complex 
matrices A, where A = adj A,” Current Trends in Matrix Theory, North-Holland, 
1987, pp. 405-410. Michael K. Kinyon suggested replacing (1) by the “differenti- 
ated” condition 


A+A* = tr( A). (2) 
A similar method leads to the sequence of Lie algebras corresponding to the 
groups found above. 


Solved also by D. Callan, R. J. Chapman (U.K.), I. Dimitric, W. T. Gan (student, U.K.), N.-G. Kang 
(student, Korea), M. K. Kinyon, N. Komanda, C. Lanski, F. Schmidt, R. Stong, E. T. Wong, University 
of Wyoming Problem Circle, and the proposer. Six incomplete solutions were also received. 


Summing a Series of Volumes 


10207 [1992, 266]. Proposed by Eric Freden (student), Brigham Young University, 
Provo, UT. 


Find a closed form for $\sum_{n=0}^{\infty} \operatorname{Vol}(B^n)$ where $B^n$ is the unit ball in $\mathbb{R}^n$ (and 
$\operatorname{Vol}(B^0)$ is taken to be 1). 


Composite solution by several solvers. More generally, if we take $B^n$ as the ball of 
radius $r$ in $n$-dimensional space, then the series converges for all $r > 0$ and 

$$\sum_{n=0}^{\infty} \operatorname{Vol}(B^n) = e^{\pi r^2}\left(1 + \frac{2}{\sqrt{\pi}}\int_0^{r\sqrt{\pi}} e^{-t^2}\,dt\right).$$

This is proved in detail in D. J. Smith and M. K. Vamanamurthy, “How small is 
the unit ball?”, Math. Magazine 62 (1989), 101-107. 


Editorial comment. Most solvers indicated a reference both for Vol(B") and the 
value of the resulting series. The terms of even dimension clearly determine an 
exponential function. The series consisting of the terms of odd degree can be 
recognized in terms of the solution of the initial value problem: f’(x) = 1 + xf(x), 
f(0) = 0. Fourteen different references were given, none by more than three 
solvers. 
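
A quick numerical comparison supports the closed form. The sketch below assumes the standard volume formula Vol(B^n) = π^{n/2} r^n / Γ(n/2 + 1) and writes the integral through the error function, erf(x) = (2/√π)∫_0^x e^{-t²} dt; the truncation at 200 terms and the sample radii are arbitrary choices.

```python
from math import pi, gamma, erf, exp, sqrt

def ball_volume(n, r=1.0):
    """Volume of the n-dimensional ball of radius r: pi^(n/2) r^n / Gamma(n/2 + 1)."""
    return pi ** (n / 2) * r ** n / gamma(n / 2 + 1)

def closed_form(r=1.0):
    """e^{pi r^2} * (1 + erf(r * sqrt(pi)))."""
    return exp(pi * r * r) * (1 + erf(r * sqrt(pi)))

for r in (0.5, 1.0, 2.0):
    partial = sum(ball_volume(n, r) for n in range(200))
    assert abs(partial - closed_form(r)) < 1e-9 * closed_form(r)
```

The even-dimensional terms alone sum to e^{πr²}, which is the exponential function mentioned in the editorial comment; the odd-dimensional terms account for the error-function part.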


Solved by K. F. Andersen (Canada), J. Anglesio (France), S.-J. Bang (Korea), W. H. Beckmann, 
D. M. Bloom, D. Callan, R. J. Chapman (U.K.), J. I. Concha (Chile), T. Dali and S. Smith and M. 
Carlton and P. Bracken, M. Dindos (Slovakia), M. Dresevi¢é and N. Caki¢ (Yugoslavia), M. Fichter 
(Germany), C. Georghiou (Greece), C. P. Grant, N.-G. Kang (student, Korea), M. K. Kinyon, N. 
Komanda, I. I. Kotlarski, R. Kreczner, O. P. Lossers (The Netherlands), S. Matz, A. Pedersen 
(Denmark), K. Perera (student), F. C. Rembis, R. M. Robinson, P. Sawyer (Canada), B. D. Sterba- 
Boatwright, R. S. Tiberio, A. Tissier (France), D. B. Tyler, D. C. Vella, M. Vowe (Switzerland), D. M. 
Wells, P. J. Zweir, National Security Agency Problems Group, Shreveport Problem Solving Group 
(LSU), University of Wyoming Problem Circle, and the proposer. One incorrect solution was received. 

Collaborating editors: David F. Appleyard, Paul T. Bateman, Bruce C. Berndt, 
Duane M. Broline, Barry W. Brunson, Frank S. Cater, Gulbank D. Chakerian, 
Underwood Dudley, Gerald A. Edgar, Michael A. Filaseta, Ira M. Gessel, Richard 
A. Gibbs, Jerrold R. Griggs, Douglas A. Hensley, John R. Isbell, Mourad E. H. 
Ismail, Murray Klamkin, Daniel J. Kleitman, Frederick W. Luttmann, Frank B. 
Miles, Richard Pfiefer, Stephen L. Portnoy, J. O. Shallit, John Henry Steelman, 
Kenneth B. Stolarsky, David E. Tepper, Douglas B. Tyler, Daniel Ullman, and 
William E. Watkins. 

Answer to Picture Puzzle 
(p. 847) 


Louis de Branges, the solver of the 
Bieberbach conjecture. 


During the last quarter of a century 
there has been a universal effort to im- 
prove the quality of teaching in the ele- 
mentary and secondary schools. Whenever 
a change is made in this country in the 
curricula for the training of teachers, it 
has been in the direction of more “educa- 
tion”, pedagogy and psychology, always at 
the expense of further courses in subject 
matter. The results are already apparent; 
for grade schools the new method is an 
improvement, but for high schools, espe- 
cially the last two years, it is lamentably 
deficient. However desirable the other 
things may be in themselves, for a teacher 
of mathematics nothing has yet been dis- 
covered to replace a knowledge of mathe- 
matics. May the present volume take its 
place in American and English schools, to 
extend the service it has so admirably 
rendered in Germany. 


VIRGIL SNYDER 


—American Mathematical Monthly 
40 (1933), p. 171 


Thomas Archer Hirst— 
Mathematician Xtravagant 
VI. Years of Decline 


J. Helen Gardner and Robin J. Wilson 


I have had several letters during the week from Cayley on Geometrical Transformation. I wish I 
were at liberty to do my part in the important investigations that are now ripe; but I have to 
exercise self-denial. My lectures absorb my time and constitute my duty. Sylvester again is 
actively thinking and producing, and Chasles has just published a most important extension of 
his method. I must simply look on. 


By 1865, Thomas Hirst was at the height of his powers. As Professor of Mathemat- 
ical Physics at University College, London, Vice-President of the newly-formed 
London Mathematical Society, a member of the distinguished X-club, and a 
Council member of the Royal Society, he was in a position to influence those 
around him. A long-standing ambition was to propose the French geometer Michel 
Chasles for the Royal Society’s Copley medal. 


29th October 1865: ...my proposition (although late) was well received; it was unanimously 
agreed that his name should be put on the list. The adjudication is on Thursday next, and I shall 
work hard to carry him. He has formidable rivals however in Regnault, Plucker, and Poncelet. 


And he was successful, although the ailing and elderly Chasles was too unwell to 
come to London for the ceremony. At the celebration dinner afterwards, a toast 
was proposed to Chasles, the Copley Medallist and ‘his friend Dr Hirst’. Following 
this toast, Hirst made a speech describing Chasles’s achievements, which he 
included in full in his diary entry. He was obviously very pleased with his success, 
and now looked forward to presenting Chasles with his prize. 


30th November 1865: ...I have but one step more to take and that will be across the channel to 
the Passage St. Marie, Rue de Bac at Paris, there with my own hands I will place the medal in 
the hands of Chasles, as a grateful offering to the man who, next to Steiner, has been most 
influential in determining my own career. 


24th December 1865: ... My first act this morning was to call on Chasles and deliver the Copley 
Medal. It was manifestly a welcome present to him... 


Throughout 1866, Hirst added further to his list of personal achievements. In 
February, he was elected a Member of the Athenaeum Club, in June he was 
appointed General Secretary of the British Association for the Advancement of 
Science, an onerous post which he held for four years, and in November he was 
admitted a Fellow of the Royal Astronomical Society. While attending a rather 
uninteresting meeting of one of his clubs, he ‘made a calculation to show that 
there would be ample standing room for all the inhabitants of the Globe in the Isle 
of Wight’—a result which greatly surprised him. 

Meanwhile, the London Mathematical Society was quickly becoming established. 
At one meeting, the President, Augustus De Morgan, ‘called attention to 
the novelty and importance of many of the papers, and remarked that this was the 
only society in England where such papers could be received’. It seems that the 
Society was keen to encourage young talent: 


22nd November 1866: At Math. Society. Clifford of Trin. Coll. Cambridge made his first 
appearance and gave us a very good paper ‘on Harmonics’. There is no young mathematician of 
greater promise than Clifford just now. 


University College, London. 

In 1867 Augustus De Morgan had a disagreement with University College, and 
resigned his Chair in Mathematics. Hirst was elected in his place ‘unconditionally 
and most unanimously’. He proved to be a first-class choice, if the memory of one 
of his students is accurate. 


Thomas Hirst in 1866. 


‘His presence in the classroom was striking. 
He was tall, and held himself erect with an 
almost military air. He had a long black 
beard and a great, bald, dome-like fore- 
head. He was a man with whom it was 
impossible to imagine the most audacious 
student venturing to take a liberty. There 
was something about him that invested his 
unlovely subject with dignity, if not inter- 
est. Less, perhaps, than any of the other 
professors, did he seem to think of exami- 
nations. To him, I believe, incredible as it 
sounds, mathematics must have been a 
solemn, high pursuit: a passion, if not a 
religion. Yet with all his aloofness of man- 
ner he could be very simple, very patient, 
and extremely kind. Certainly to one of his 
most hopeless pupils he showed himself all 
three.’ 

Meanwhile, the X-club continued its tradition of monthly meetings, with the 
occasional distinguished guest in attendance. 


3rd March 1868: At the X-Club. Darwin was our guest. I was in the chair, and again the evening 
passed very pleasantly away. 


2nd April 1868: At the X. Huxley, Frankland, Sir J. Lubbock and myself were the only ones who 
dined. Spottiswoode was there for an hour and brought Clifford with him. Clifford is the Lion of 
this season. Everybody is anxious to entertain him. I hope only his head will remain unturned. 


But increasingly he found that his administrative and lecturing duties left him too 
little time for his researches, and he frequently complains of his inability to spend 
enough time on geometry. 


7th February 1869: At home writing paper on Degenerate Conics. This paper perplexes me 
sorely, I begin to fear that it will never be satisfactorily written until I can work at it 
uninterruptedly. My daily duties so absorb my thoughts that I can only in leisure hours succeed 
in turning them to this new work, and no sooner are they turned and effective work rendered 
possible than the said duties turn them away again. 

Tyndall, generous friend, proposed a remedy for this incessant disappointment I experience 
which I must record; it was so characteristic. “Give up your Professorship and devote yourself 
for a few years to your work solely. I have more money than I want and I can easily spare you 
what you would require to enable you to work without embarrassment.” 

However dear to me the privilege of thus working I could not, of course, accept it on these 
easy terms. My first duty is to earn my bread by teaching; if original research is not compatible 
with the performance of this duty then I must sacrifice originality however dear to me it may be, 
or however much my science might be advanced thereby. If the mathematical world prefer my 
teaching to my researches what right have I to complain? Can I even say that its choice is a bad 
one? I doubt it. 


Tyndall realized that Hirst’s researches could lead to important discoveries. 
Indeed, had he managed to persuade Hirst to take up his offer, Hirst’s name might 
have been better remembered. As it was, the situation did not improve, and two 
days after his 39th birthday, Hirst wrote: 


24th April 1869: Working at quadric transformation. Cayley and Clifford have begun to work at 
the subject and unless I communicate what I did in 1865 I shall be out-run. How I long to have 
leisure to pursue my work. So long as my present drudging continues I shall be scientifically 
speaking extinguished. 


Despite this, or perhaps because of a need for relaxation from the pressure he was 
under, Hirst made one of his regular visits to the Continent to meet old and new 
acquaintances: 


26th July 1869: Bath in Neckar. We walked up to the Castle and saw all over it, the Fass 
included. Dined at Hotel Schrieder at 1. P.M. with Bunsen where we met Kirchhoff (on 
crutches) and Konigsberger the Mathematician and successor of Hesse, now at Munich. We took 
our Abendessen with Helmholtz... 


27th July 1869: After another bath in the Neckar I attended Konigsberger’s lecture on Theory of 
Determinants. He introduced me to a young Russian lady [Sonya Kowalevskaya]... who attends 
his lectures and is at home in Elliptic Functions. She belongs to the mathematically gifted family 
of Schuberts. She is pretty and exceedingly modest. 

Back in England, Thomas Hirst’s teaching activities took a new direction: 


Ladies’ Educational Association, London. A Course of Twenty-four Lectures on the 
Elements of Geometry will be given by Professor Hirst, in the Minor Hall, St. George’s 
Hall, Langham Place, on Mondays and Fridays at 11. A.M. (beginning on January 17), 
should a sufficient number of tickets be applied for before Christmas. The Lectures will 
be of an elementary character requiring no previous knowledge of the subject, the extent 


to which it will ultimately be carried being dependent upon the progress of the class. 

Fee for the Course of 24 Lectures, £11.1.6; Governesses £1.1s. Ladies over seventeen 
years of age may join this or any other Course in connection with the Association (that of 
Chemistry subject to the approval of the lecturing professor) after Christmas on the above 
reduced terms. 


Thirty ladies enrolled for the first lecture, but about sixty attended. By the 
following week, fifty-seven students had enrolled, and a measure of Hirst’s excep- 
tional teaching skills may be gained in that half-way through the course he records 
that ‘one or two only have confessed inability to follow’. 

After long deliberation, he made up his mind to resign his chair, and apply for a 
well-paid administrative job which he hoped would give him more time for his 
research. 


28th February 1870: ...The fact that I cannot at present do any original work, that it is only by 
devoting myself wholly to lecturing that I can keep up my number of students at the College and 
thus secure my bread; that as my strength fails my prospects will necessarily be worse at 
University College; these facts I say decided me at length to apply for an appointment of an 
inferior order, perhaps, but of a less arduous and more remunerative character. Moreover if I 
succeed I shall come in contact with good and influential men and myself be able to influence to 
some extent the character of Education in England. 


After some confusion, in March 1870 the Senate of the University of London 
appointed him Assistant Registrar and, for a time, his researches began to make 
progress again. He began work on a memoir on the “Correlation of two Planes”. 


31st December 1871: ...It grows under my hands both in bulk and, I think, in value. Small as is 
my year’s achievement, it has given to my life a purpose for which I feel grateful. It has raised my 
life in my own estimation,—and it is almost the only thing that has done so—above mere routine 
and mediocrity. To keep my brain clear and in a condition to discover geometrical relations has 
become to me a main purpose in life, all other objects have in comparison become of little 
moment to me. 


Hirst now found time to devote to a topic which had been dear to his heart for 
several years. Already by 1868 he had come to believe that Euclid’s Elements 
should be supplanted as the main geometry textbook in English schools, and 
accordingly he had spent some time editing a new geometry book by Richard 
Wright. This conviction, arising from his years as a surveyor and his experience of 
teaching practical geometry at Queenwood and University College School, left him 
well placed to help establish a new association whose aim was to reform the 
teaching of geometry in schools. This was the Association for the Improvement of 
Geometrical Teaching, which was founded in January 1871; Hirst was its first 
president, and held office for seven years. Later, in the 1880s, it broadened its 
scope to cover the whole range of school mathematics, and in 1897 it was re-named 
the Mathematical Association, a name which it holds to this day. 

In 1872, Hirst was elected President of the London Mathematical Society for a 
period of two years. Sylvester had suggested Cayley for this post, and Hirst was 
also proposed, despite wanting to remain Treasurer. 

The Royal Naval College in Greenwich. 


10th October 1872: ...At the first vote Cayley stood first, I next and Henrici last but none 
obtained an absolute majority of votes. Henrici’s name was accordingly withdrawn and the voting 
resumed when I obtained one more vote than Cayley. I voted for Cayley both times... Had 
Spottiswoode not strongly urged my accepting the office of President and had it been any other 
than Sylvester who divided the Council between Cayley and myself I should have persisted in 
declining to serve in any other capacity than that of Treasurer. Sylvester’s animus against me 
was disagreeably manifest. It has lasted now for years and the cause of it is just as unknown to 
me as it was on its first appearance. ... 


In the following year, Hirst embarked on his fourth (and final) career. He was 
appointed the first Director of Studies at the Royal Naval College in Greenwich, 
with a salary of £1200 per year, plus a house. This position enabled him to keep in 
touch with the international mathematical community. 


3rd October 1873: Tchebichef, who called on me a few days ago, and Klein dined with me at 
Greenwich. Tchebichef told us of a mode of converting circular into rectilineal motion (a propos 
of the parallelogram of Watt) which was a simple and beautiful application of Quadric Inversion. 


For some years there had been a lack of contact between Hirst and Sylvester. In 
1875, on learning that the latter was suffering from rheumatism in the eyes, Hirst 
broke the long silence by expressing his sorrow at Sylvester’s affliction. 


25th May 1875: ...He voluntarily shook hands with me, and thus at last there is a kind of 
reconciliation between us. I am very glad of it, though I have learned to my sorrow that our 
former intimacy can never be renewed. What the exact cause of our original estrangement was I 
never knew, but I do know that he suspected me most unjustly of incessantly plotting to 
undermine his influence in the scientific and mathematical circles. He misconstrued every act 
and word of mine to such an extent that intercourse was impossible. 


In 1878, his work was recognized by the University of Cambridge: 


8th June 1878: I received the Diploma of Membership of the Cambridge Philosophical Society. I 
was gratified about a month ago to hear through Glaisher of my election. I may say that this is 
the first recognition I have ever received from any University in my native country. 

The next year, he made yet another visit to the Continent. While in Paris, he met 
Liouville in the street. 


18th May 1879: ...A little shrivelled gouty old man he has become and very garrulous. It was 
with difficulty I broke away from him... 


More enjoyable was a visit to Parpan in Switzerland, a little Alpine village 4,545 
feet above the sea, where ‘I found Cremona, Casorati, Beltrami (with the Signora 
C), Geiser, Schlaffli, Frobenius and Meier (seven mathematicians!)’. 

Then, in late 1880, he learned that his unpublished researches had indeed been 
out-run, as he had forecast in 1869. 


11th November 1880: The first meeting of the Math. Soc. took place on Nov. 11th. Cayley came 
to it and stopped with me. We were speaking of Cantor’s paper on the cyclical self-corresponding 
points in two coincident planes between which a quadric relation exists. It has just appeared 
in the Annale di Matematica. I communicated precisely the same theorem to the British 
Association at Birmingham in 1865 but nothing was printed about it except the barest notice in 
the Proceedings. I showed Cayley my M.S. notes for that communication. He took them home 
with him and expressed an intention to write something about the matter. I shall be glad to be 
associated with a theorem which was always a pet of mine. As usual however I went on nursing 
my pet with the intention of allowing it to grow and develop itself more before I published it. 


In 1883 he heard from Thomas Huxley that the Royal Society had awarded him its 
prestigious Royal Medal, principally for his work on Cremona transformations: 


30th November 1883: ...I received my Royal Medal from Huxley who addressed to me a few 
friendly words in addition to the formal ones of presentation. “Although quite out of order” he 
said “I cannot refrain from expressing my sincere pleasure at being able, on the first occasion of 
my official representation of the Royal Society, to hand this Royal Medal to one of my oldest 
friends”. 


The Royal Society—a portrait group of some of the most distinguished Fellows in 1889. 


At the front are, from left to right, Sir Gabriel Stokes, Sir Joseph Hooker, James Joseph Sylvester, 
Thomas Huxley, Archibald Geikie, John Tyndall, Arthur Cayley, Sir Richard Owen, W. H. Flower, and 
William Crookes. 

The last entries of Hirst’s diary, and Hirst’s grave in Highgate Cemetery, North London. 

Hirst’s health had always been a cause of concern, and now it continued to decline 
as first kidney stones and then a stomach tumour were diagnosed. He found life 
very lonely when his brother John died and his favourite niece, who at one stage 
had been his housekeeper, married and then died in childbirth. He travelled to 
Greece and Egypt and in 1883 he gave up his Greenwich post at the age of 53. He 
now had the time to work on his geometry, at his clubs in London during the 
summer, and in France during the winter. He also featured in a popular book: 


11th January 1890: I gave some final touches today to the notice of myself and my work in “Men 
of the Time”. It will be posted tomorrow. 


Finally, in 1890, he finished his memoir on the correlation of two spaces. He had 
worked on it for a long time, and after its completion he destroyed his mathematical 
notebooks. Suddenly he seemed old, spending his time in watching the rapidly 
changing world from his clubs, his flat, and the park: 


23rd August 1890: ... What a mad world it is! In the distance the Sunday Band was playing 
unmelodiously. What a noisy, jigging world it has become! 


He became increasingly depressed by the number of his colleagues and acquaintances 
who were departing this world. 


19th February 1891: ... At the Athenaeum I read, in Nature, of the death of Madame Sophie 
Kovalevsky (aged 38), Professor of Mathematics at the Högskola of Stockholm. When she was 18 
years of age I was introduced to her by Konigsberger at Heidelberg, whose lectures she was then 
attending. Some years afterwards she studied under Weierstrass at Berlin... As far as her 
mathematical abilities were concerned, she appears to have been superior to any predecessor of 
her own sex. She died from an attack of pleurisy; brought on, it is believed, by a chill which 
succeeded her rapid journey home from the South of France in order to commence her lectures 
at Stockholm. 


For thirty-four years Anna had never been far from his thoughts and in September 
1891, he paid one of his regular visits to Paris to bid Anna good-bye for the last 
time. The turn of the year brought yet more sad news. 


7th January 1892: I hear from Sturm this morning that Heinrich Schröter, of Breslau, is dead. 
He and I heard Steiner’s lectures together, at Berlin, in 1851-2... He has been taken before 
me. When will my time come?... 


It came sooner than he thought. London was hit by a flu epidemic, one of the 
worst of the century. His resistance lowered by years of illness, and now suffering 
from cancer of the prostate, Hirst quickly succumbed. His last diary entries were 
written just four weeks before his death. 


17th January 1892: ...the symptoms of violent cold in the head continued until nearly midnight. 
I then went to bed, but slept only in a disturbed fashion and awoke with pains and cramps all 
over my body. I fear the influenza has overtaken me. 


18th January 1892: I rose in a sad plight. I took coffee for breakfast, however. This set the 
bowels acting; but no relief from my oppressive malaise followed. Cranstone called to look at the 
fallen chimney-piece in my sitting room. 


On 16th February 1892 he died, and was buried in Highgate Cemetery. 


46. Proposed by H. C. WHITAKER, A. M., Ph.D., Professor of Mathematics, Manual Training School, 
Philadelphia, Pennsylvania. 

“There was an old woman tossed up in a basket 
Ninety times as high as the moon.” 
Mother Goose 

Neglecting the resistance of the air, how long did it take the old lady to go up? 

American Mathematical Monthly 3 (1896), p. 281 


Densest Packings of Congruent Circles 
in an Equilateral Triangle 


Hans (J. B. M.) Melissen 


1. INTRODUCTION. How large is the smallest square box that can contain n 
milk-bottles? If n points are distributed in a circle such that the distance between 
any two points is at least d, what is the largest possible value for d? Figure 1 shows 
why such problems are closely related. If K is a circular disc, or a polygonal region 
whose edges are all tangent to a circle, packing n equal circular discs of maximum 
diameter inside K is equivalent to finding n points in K such that the pairwise 
minimum distance between points is maximal. For instance, in a unilateral triangle K 
(an equilateral triangle with sides of length 1), these points are the centers of circles 
of diameter d that pack into (1 + √3 d)K. 
We will refer to d as the maximum separation distance of n points in K. As there 
seems to be little hope of solving the packing problem for all n, research has 
focussed on asymptotic estimates and on the investigation of small values of n. 
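
The factor 1 + √3 d can be explained by noting that pushing each side of an equilateral triangle outward by d/2 increases its inradius (the side length divided by 2√3) by d/2, so the side grows by √3 d. As a minimal sketch, the conversion between a separation distance in the unilateral triangle and the radius of the corresponding circles in a triangle of side 1 might look as follows; the function names are illustrative and do not come from the paper.

```python
from math import sqrt, isclose

def enclosing_side(d):
    """Side of the equilateral triangle that contains circles of diameter d
    centred at points of a triangle with side 1 (side grows by sqrt(3) * d)."""
    return 1 + sqrt(3) * d

def circle_radius_in_unit_triangle(d):
    """Radius of the same circles after rescaling the enclosing triangle to side 1."""
    return (d / 2) / enclosing_side(d)

# Three points at the vertices (d = 1) correspond to three equal circles
# of radius (sqrt(3) - 1) / 4 packed in a triangle of side 1.
print(circle_radius_in_unit_triangle(1.0))
print(isclose(circle_radius_in_unit_triangle(1.0), (sqrt(3) - 1) / 4))
```

A larger separation distance therefore always gives a larger circle radius, which is why the two optimization problems have the same solutions.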


Figure 1. Densest packing of seven circles in an equilateral triangle. Seven points in an equilateral 
triangle with largest possible minimum distance between the points. 


During the last decades much progress has been made for circle packings inside 
a number of simple geometrical shapes, such as the square and the circle. 
Solutions were found by trial and error or by computer-aided optimization. 
Although near-optimal packings are easy to construct, few optimality proofs have 
appeared so far and many conjectures remain unproven. An excellent review with 
relevant references can be found in [2]; see also [4, 11]. 

In 1969 Pirl [13] exhibited circle packings in a circle for n = 2, ..., 20 and 
proved their optimality for n ≤ 10. A proof for n = 11 was given recently by the 
author [8]. 

Optimal circle packings in a square have been constructed for n = 6 by 
Graham, for n = 7 by Schaer (both unpublished), for n = 8 by Schaer and Meir 
[14] and for n = 9 by Schaer [15]. Wengerodt (and Kirchner) [18, 17, 19, 7] gave 
proofs for n = 14, 16, 25 and n = 36. 

Another problem that comes to mind is the packing of equal circles into an 
equilateral triangle. Surprisingly, only the case of the triangular numbers n = 
k(k + 1)/2 has been tackled in the literature [12, 2]. In the vein of Pirl, Schaer 
and Wengerodt we will provide optimal arrangements for n ≤ 10 and n = 12, and give 
an alternative proof for the triangular numbers. 

The closely related problem of partitioning an equilateral triangle into subre- 
gions such that the maximum of the diameters is minimal has been studied by 
Graham [6]. Optimal packings of 2, 3, 4, 5, 8, 9 and 10 equal spheres in a regular 
tetrahedron can be found in [1]. 


2. OPTIMAL PACKINGS IN AN EQUILATERAL TRIANGLE. Figures 2a–k and 
2p show arrangements of n points inside a unilateral triangle for which the 
minimum distance between the points is maximal. The solid lines in the figures 
connect those pairs of points for which the distance is equal to the maximum 
separation distance d_n. The values of d_n are given in Table 1. For n = 
2, 3, ..., 10, 12 we will prove that the arrangements shown are indeed optimal. The 
proofs for n = 2, ..., 7, 10 consist in constructing a decomposition of the triangle 
into at most n − 1 subregions. Dirichlet’s pigeon-hole principle tells us that one of 
the subregions must contain at least two points. The maximum diameter of the 
subregions is then an upper bound for the minimum possible distance between two 
points of the arrangement. In the cases under consideration, this upper bound is 
attained by the given configuration. The optimality proof for the arrangements of 
eleven points is rather involved and will be the subject of a separate paper. The 
cases n = 2, 3 are evident, so we will proceed with n = 4. 


2.1. Arrangements of Four, Five and Six Points. Two of the four points must lie in 
the same subregion from the partition shown in Figure 3a, so d_4 ≤ 1/√3. If the 
upper bound is attained, then one point must lie at the center and the other one is 
a vertex of the triangle. The only possible locations left for the other two points are 
then the other two vertices of the triangle, so for n = 4 the configuration is 
unique. 

Using the partition of the triangle into four triangles as in Figure 2e, it follows 
that the maximum separation distance for n = 5 and n = 6 is equal to 1/2. The 
configuration for n = 5 is just the arrangement for n = 6 from which one arbitrary 
point has been removed. This is the only freedom allowed in finding an optimal 
arrangement for n = 5. 
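
The configurations described in this subsection can be checked numerically. The sketch below places a triangle with sides of length 1 at coordinates chosen here purely for illustration, and computes the minimum pairwise distance for the four-point arrangement (three vertices and the center) and for the six-point arrangement (three vertices and three edge midpoints); the helper names are not from the paper.

```python
from itertools import combinations
from math import dist, sqrt, isclose

# Triangle with sides of length 1; the coordinates are an illustrative choice.
A, B, C = (0.0, 0.0), (1.0, 0.0), (0.5, sqrt(3) / 2)

def midpoint(p, q):
    return ((p[0] + q[0]) / 2, (p[1] + q[1]) / 2)

def min_separation(points):
    """Smallest distance between any two of the given points."""
    return min(dist(p, q) for p, q in combinations(points, 2))

center = ((A[0] + B[0] + C[0]) / 3, (A[1] + B[1] + C[1]) / 3)
four = [A, B, C, center]
six = [A, B, C, midpoint(A, B), midpoint(B, C), midpoint(C, A)]

print(min_separation(four))  # compare with 1/sqrt(3)
print(min_separation(six))   # compare with 1/2

# Removing any single point from the six-point arrangement keeps the value 1/2.
five_values = [min_separation(six[:i] + six[i + 1:]) for i in range(6)]
print(five_values)
```

This only confirms that the stated bounds are attained; the upper bounds themselves come from the pigeon-hole argument in the text.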


2.2. Arrangements of Seven Points. An interesting feature of n = 7 is that, apart 
from reflected configurations, there are two different types of optimal solutions, as 
is illustrated in Figures 1 and 2f. One is symmetric and rigid. In the other one 
(dashed in Figure 2f), where the interior point on the left is moved to the base of 
the triangle, the position of the left point in the second row is no longer unique. By 
projecting on the base of the triangle it can be seen from the configuration in 
Figure 2f that its separation distance is equal to d_7 = (√3 − 1)/2. The partition 
shown in Figure 3b is based on the points that lie on the edges and on the bisectors 
of the triangle, and that are at distance d_7 from the vertices, together with the 
center of the triangle. From this partition it follows immediately that the maximum 
separation distance is equal to d_7. The pentagonal regions can contain at most two 
points at distance d_7, in exactly one way, whereas the quadrilateral regions can 
accommodate only one point. Easy combinatorial arguments show that only the 
configurations described above are possible. 


Figure 2. Optimal and conjectured optimal (*) arrangements of points in a unilateral triangle. The 
solid line segments are of length d_n. Panels: (c) n = 4, (d) n = 5, (e) n = 6, (f) n = 7, (g) n = 8, 
(h) n = 9, (i) n = 10, (j) n = 11, (k) n = 12, (m) n = 13*, (n) n = 14*, (p) n = 15, (q) n = 17*, 
(r) n = 19*. 


TABLE 1. Maximum separation distance d_n of n points in a unilateral triangle 

n        max. separ. distance d_n          n                 max. separ. distance d_n 
2, 3     1 = 1.000000...                   12                2 − √3 = 0.267949... 
4        1/√3 = 0.577350...                13*               0.251813... 
5, 6     1/2 = 0.500000...                 14*, 15           1/4 = 0.250000... 
7        (√3 − 1)/2 = 0.366025...          17*               (3 − √3)/6 = 0.211324... 
8        (√33 − 3)/8 = 0.343070...         19*               0.200321... 
9, 10    1/3 = 0.333333...                 k(k+1)/2 − 1*     1/(k − 1) 
11       (3 − √6)/2 = 0.275255...          k(k+1)/2          1/(k − 1) 

* marks the conjectured values. 


Figure 3. Partitions for (a) n = 4 and (b) n = 7. The solid lines indicate to which subregion each edge 
belongs. The three dotted lines in (b) are of length d_7. 
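
The closed forms in Table 1 can be compared with the printed decimals. The sketch below assumes the reading of the table in which the first four rows on the left belong to n = 2, 3; n = 4; n = 5, 6; and n = 7, which is what the values derived in the text for d_4, d_5, d_6 and d_7 suggest. The list and function names are illustrative.

```python
from math import sqrt

# (row label, closed form, decimal value as printed in Table 1)
table_1 = [
    ("2, 3", 1.0, 1.000000),
    ("4", 1 / sqrt(3), 0.577350),
    ("5, 6", 1 / 2, 0.500000),
    ("7", (sqrt(3) - 1) / 2, 0.366025),
    ("8", (sqrt(33) - 3) / 8, 0.343070),
    ("9, 10", 1 / 3, 0.333333),
    ("11", (3 - sqrt(6)) / 2, 0.275255),
    ("12", 2 - sqrt(3), 0.267949),
    ("14*, 15", 1 / 4, 0.250000),
    ("17*", (3 - sqrt(3)) / 6, 0.211324),
]

def max_truncation_error(rows):
    """Largest gap between a closed form and its printed six-digit value."""
    return max(abs(exact - printed) for _, exact, printed in rows)

print(max_truncation_error(table_1))

# The closed form for n = 8 is the positive root of 8 d^2 + 6 d - 3 = 0.
d8 = (sqrt(33) - 3) / 8
print(8 * d8 ** 2 + 6 * d8 - 3)
```

Every printed decimal agrees with its closed form to within one unit in the sixth decimal place, consistent with the values having been truncated rather than rounded.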


2.3. Arrangements of Eight Points. A straightforward computation shows that the 
separation distance for the configuration in Figure 2g satisfies an equation of 
degree four, leading to a separation distance of d_8 = (√33 − 3)/8. The arrangement 
is unique up to rotations. 

To prove that d_8 is optimal, suppose that we have a configuration for which the 
distance between any two points is at least d > d_8. It is easy to see that a point 
that is closest to a vertex of the triangle can be moved to that vertex without 
disturbing the optimality of the solution. We can therefore assume that the three 
vertices of the triangle are part of the configuration. This assumption will not 
restrict the total number of solutions to be found. 

We will make use of the decomposition in Figure 4a. The vertices in this 
partition can be found by using the points from the arrangement in Figure 2g and 
their rotated images, together with the center of the triangle. Consider the closed 
region formed by the union of the subregions R_1, R_2, R_3, Q_1, Q_2, Q_3. In this region 
five points must be accommodated at a mutual distance of at least d. All its 
subregions have a diameter of at most d_8. As d_8 < d, each cannot hold more than 
one point of the solution. This means that two of the Q_i, together with their 
interjacent R-region, must each contain one point of the solution, for instance 
Q,,R,,Q,. This cannot happen, because |A,D| = |B,D| =d,. Here D is the 
midpoint of A,B, (divide Q, UR, UQ, with a cut along DC and apply the 
pigeon-hole principle). 


Figure 4. Partitions for n = 8 and n = 12. 


Now we shall determine all possible configurations for which the separation 
distance is equal to d_8. Each R̄_i (the closure of R_i) can contain at most two points 
of the solution. For instance, for R,, the possible combinations would be A, — B, 
and A, — B,. First, we show that no Q_i can contain a point of the solution in its 
interior. 

1. If two of the Q-regions, for instance Q, and Q,, have a point of the solution 
in their interior, then R, cannot contain a point. Furthermore no point can be in 
the interior of Q,, otherwise there could be no solution points in R, and R3. This 
implies that the union of R, and R, must contain at least three points of the 
solution, so one must contain two points. This is impossible, because one of these 
points (A,, A,, B,; or B,) would then be too close to at least one of the solution 
points in Q, or Q,. 

2. If only Q, has a solution point in its interior, then R, and R, will not contain 
more than one point each, so there must be two points in R,. This cannot occur 
because one of these points would be too close to the points of the configuration in 
R, and R,. 

The five points must therefore be distributed over R,, R,, R3. As the center of 
the triangle cannot be part of the solution, one of the three regions (e.g. R;) 
contains only one point. This implies the solution given in Figure 2g. The 
arrangement is unique up to rotations. 


2.4. Arrangements of Nine and Ten Points. The unique configuration for n = 10 is 
an easy consequence of the pigeon-hole principle, applied to the obvious subdivision 
into triangles (see Figure 2i). The configurations for n = 9 can be obtained by 
removing one arbitrary point from the arrangement for n = 10. Unfortunately the 
pigeon-hole principle cannot be applied for n = 9, because a partition into eight regions 
must contain a subregion of diameter 2/(1 + √3 + √(6√3)) > 1/3 [6]. 

First, we shall demonstrate that the maximum separation distance for n = 9 is 
equal to 1/3. Suppose that for some configuration the distance between the points 
is at least 1/3 + e, where e > 0. This means that there must be exactly one point 
in each subregion in Figure 2i. The three points in the three outermost triangles 
prohibit other points from coming within a distance e from the edges of these 
triangles. Consequently, the region inside the hexagon where the remaining six 
points should be situated is actually contained in a disc of radius r_e 
= √(1 − 3e + 9e²)/3 < 1/3. According to Pirl [13, §2], the separation distance of 
these points cannot exceed r_e, which contradicts the assumption that the distance 
exceeds 1/3. 
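
Both numerical claims in this subsection are easy to check. The sketch below reads the diameter bound cited from [6] as 2/(1 + √3 + √(6√3)) and the disc radius as √(1 − 3e + 9e²)/3; both readings, and the names used, are assumptions made for this illustration. Since 1 − 3e + 9e² < 1 exactly when 0 < e < 1/3, the radius is below 1/3 for every small positive e.

```python
from math import sqrt

def disc_radius(e):
    """Radius sqrt(1 - 3e + 9e^2) / 3 of the disc in the argument for n = 9."""
    return sqrt(1 - 3 * e + 9 * e * e) / 3

# Lower bound on the largest diameter in a partition into eight regions.
eight_region_bound = 2 / (1 + sqrt(3) + sqrt(6 * sqrt(3)))

print(eight_region_bound, eight_region_bound > 1 / 3)
for e in (0.001, 0.01, 0.1, 0.3):
    print(e, disc_radius(e), disc_radius(e) < 1 / 3)
```

The first line shows why eight regions of diameter 1/3 cannot cover the triangle, so a different argument is needed for n = 9.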

Having established d_9, it is not difficult to see that all optimal arrangements for 
n = 9 can be obtained by removing one arbitrary point from the arrangement for 
n = 10. This follows from the fact that the circumscribed circle of the six inner- 
most triangles can enclose at most seven points with a mutual distance of at least 
1/3. On the other hand, the three regions outside this circle can contain at most 
three points in all, so there must be at least six points in the circular disc. From the 
configurations for the circle found by Pirl it follows that only the vertices of the 
small triangles can be part of the configuration. 


2.5. Arrangements of Twelve Points. The unique optimal configuration for n = 12 
is shown in Figure 2k. Consider the partition as indicated by the solid lines in 
Figure 4b. The coordinates of the nodes can be found in Table 2. The subdivision 
is symmetric in the bisector through A,. The triangle is now divided into twelve 
regions whose diameter is at most d_12 = 2 − √3. If the maximum separation 
distance of an arrangement were larger than d_12, then there would be exactly one 
point in each subregion. The presence of a solution point in A,,A,.A,,. subsequently 
implies that there is a point in the interior of A,)A,,A,7,, 444541, Ao; 
B,A;A,¢A, and of A,A,B,, so there can be no point in A, A, A. This contradiction 
implies that d_12 is the maximum separation distance. 

Next, we will find the unique arrangement corresponding to this maximum 
separation distance. Arguments similar to those already discussed show that there 
can be no point in A,.A,.A,, (with the possible exception of A,,). By symmetry 
the same must be true for A,A,B, and A,A,B,. Now we adapt the decomposition 
of the triangle to one with the three bisectors of the triangle as axes of 
symmetry. This is indicated by the dashed line segments in Figure 4b. The region 
A,As5A,,A,,AjgAj¢6 with the segments A;A,, and A,,A,, excluded can contain a 
maximum of four points of the optimal arrangement, and this in exactly one way 
(A,, Ay, Ais, Aig). This is evident from the partition into three subregions of 
diameter d_12. The central hexagon can contain a maximum of three points 
(A,, A,, A,,). Straightforward combinatorial arguments then show that three 
solution points in the hexagonal region correspond to the arrangement in Figure 
2k, whereas two or fewer points cannot lead to a solution. 


TABLE 2. Coordinates of the nodes in the partitions for n = 12 


2.6. Arrangements for Triangular Numbers. For the triangular numbers n = 
k(k + 1)/2 (k ≥ 2), the obvious candidates for the optimal arrangements are given by 
the regular triangular lattice arrangement in analogy to Figures 2b, e, i, p. 
Unfortunately, the partitioning trick is unsuitable to prove this for all triangular 
numbers. This is because the number of triangles (k − 1)² exceeds the number of 
points n, for k ≥ 5. Oler [12] asserted that the minimum distance between n + 1 
points in a unilateral triangle is smaller than 1/(k − 1). Looking at his proof we 
notice that Oler actually proved that d_n = 1/(k − 1), however, without showing 
that the obvious arrangement is indeed unique. The proof is based on a general 
inequality that was conjectured by Zassenhaus and proved by Oler in 1961 (see 
[5]). This inequality provides an upper bound for the number of points n that can 
be placed in a planar convex compact set K at a mutual distance of at least 1, 
expressed in terms of the area μ(K) and the perimeter μ(∂K) of K: 


\[ n \le \frac{2}{\sqrt{3}}\,\mu(K) + \frac{1}{2}\,\mu(\partial K) + 1. \tag{1} \] 


The optimality proof for the triangular numbers is obtained by applying this 
inequality to an equilateral triangle. A similar inequality of Groemer (1960, see 
[10]) can also be used. We shall give a more straightforward proof by deriving this 
inequality directly for the case of a unilateral triangle. In addition we can also 
conclude the uniqueness of the optimal solution. 
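
To see what inequality (1) gives for a triangle, one can rescale so that the points are at mutual distance at least 1; the unilateral triangle then becomes an equilateral triangle of side s = 1/d, with area √3 s²/4 and perimeter 3s, and the right-hand side of (1) simplifies to (s + 1)(s + 2)/2. The sketch below evaluates this bound for s = k − 1 and also counts the lattice triangles available to the partitioning trick. The symbol s and all function names are introduced here for illustration only.

```python
from math import sqrt, isclose

def oler_bound(area, perimeter):
    """Right-hand side of inequality (1) for points at mutual distance >= 1."""
    return 2 / sqrt(3) * area + perimeter / 2 + 1

def triangle_bound(s):
    """Inequality (1) evaluated for an equilateral triangle of side s."""
    return oler_bound(sqrt(3) / 4 * s * s, 3 * s)

def triangular(k):
    return k * (k + 1) // 2

def pigeonhole_applies(k):
    """True when the (k - 1)^2 lattice triangles are fewer than the n points."""
    return (k - 1) ** 2 < triangular(k)

for k in range(2, 8):
    print(k, triangular(k), triangle_bound(k - 1), pigeonhole_applies(k))
```

For s = k − 1 the bound equals k(k + 1)/2 exactly, so more than k(k + 1)/2 points force s > k − 1, that is, a separation distance below 1/(k − 1). The last column shows that the partitioning trick works for k = 2, 3, 4 and fails from k = 5 onward.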


Theorem. If n ≥ 2 points are placed inside a unilateral triangle then the minimum 
of the mutual distances between these points, d, satisfies the following inequality: 


\[ d \le \frac{2}{\sqrt{8n + 1} - 3}. \tag{2} \] 


Equality is attained only if n = k(k + 1)/2, (k ≥ 2), for points on a regular 
triangular lattice. 

Proof: Suppose that for some n an arrangement is given. The circles centered 
around these points with radius r = d/2 then form a packing inside an equilateral 
triangle with a side-length of 1 + 2√3 r. The plane could be tiled with these 
triangles to obtain a global circle packing. For our purpose, however, this packing 
is not good enough. We will use a more economical packing shown in Figures 5 
and 6. The packing is reflected in a line parallel to one side of the triangle 
touching the circles. This mirror image is then slightly moved until it fits snugly 
into the original arrangement (see Figure 5). It is easy to see that this can always 
be done. This process is repeated to obtain an infinitely long strip of circle 
packings. The same technique is then applied to the strip resulting in a global 
circle packing. This is possible because in the strip the arrangement of circles 
repeats itself after six triangles. The side-length of the triangles in Figure 5 is 
1 + 3r. Although these triangles overlap, the plane can be tiled by the trapezoids 
as shown in Figure 6 (the shaded regions correspond to mirror images of the 
arrangement). It is a well-known result of Thue [16, 4] that the density of a plane 
circle packing cannot exceed π/√12 and that this maximum value is attained for 
the honeycomb packing where each circle touches six neighbors and the centers 
are on a regular triangular lattice. This implies the following estimate: 


\[ \frac{n \pi r^{2}}{\text{area of one trapezoid}} \le \frac{\pi}{\sqrt{12}}, \] 


which leads to inequality (2). For triangular numbers n = k(k + 1)/2, this inequality 
reduces to d ≤ 1/(k − 1); in this case the hexagonal packing is the 
unique optimal solution. ∎ 


Figure 5. Packing of packed triangles in a strip. 

Figure 6. Tiling with truncated triangles. 
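
The step from the density estimate in the proof to inequality (2) can be spelled out as follows. This is a sketch under an assumption that the text does not state explicitly: each trapezoid is taken to be an equilateral triangle of side 1 + 3r with a corner triangle of side r cut off, so that its area is (√3/4)[(1 + 3r)² − r²]. With r = d/2 the computation runs:

```latex
\frac{n\pi r^{2}}{\frac{\sqrt{3}}{4}\bigl[(1+3r)^{2}-r^{2}\bigr]} \le \frac{\pi}{\sqrt{12}}
\;\Longrightarrow\;
8nr^{2} \le (1+2r)(1+4r)
\;\Longrightarrow\;
2n \le \Bigl(\frac{1}{d}+1\Bigr)\Bigl(\frac{1}{d}+2\Bigr).

% Solving the quadratic inequality in 1/d:
\frac{1}{d} \ge \frac{\sqrt{8n+1}-3}{2}
\;\Longleftrightarrow\;
d \le \frac{2}{\sqrt{8n+1}-3}.

% Triangular numbers: 8n + 1 = 4k^2 + 4k + 1 = (2k+1)^2 for n = k(k+1)/2, hence
d \le \frac{2}{(2k+1)-3} = \frac{1}{k-1}.
```

Under this assumed area the bound is attained exactly by the triangular lattice arrangement, which is consistent with the statement of the theorem.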


3. CONJECTURES. For n = 13, 14, 19, conjectures for the optimal arrangements 
are presented in Figure 2m, n, r. The optimal arrangements for n = k(k + 1)/2 − 1 
seem to be obtained by removing one arbitrary point from the arrangement for 
n = k(k + 1)/2. This conjecture was posed as an open problem by Erdős and Oler 
[12, 2]. We have already shown its validity for n = 2, 5, 9. The conjecture actually 
implies a still open conjecture of Fejes Tóth [3], which states that if n + 1 circles 
are removed from the honeycomb packing of equal circles, and n are packed again 
in the resulting interstitial space, then we always end up with the original packing 
from which one circle has been removed. 

The configurations in Figures 2c, g, m, r suggest a possible form for the optimal 
arrangements for n = k(k + 1)/2 − 2, (k ≥ 3). First k − 3 layers of (k − 3) 
equilateral triangles, followed by a layer of k − 3 pentagons. We conjecture that 
these are the unique optimal configurations in these cases (up to rotations). The 
conjecture is true for n = 4 and 8. 

Conjectures for n = 16, 17 and 18 are presented in [9]. One configuration is 
shown in Figure 2q. 


Partnerships 


Alan H. Schoenfeld 


In the section called “Action,” Everybody Counts (National Research Council, 
1989) issued this clarion call: 


“In the next decade, the United States has a historic opportunity to revitalize mathematics 
education... 


“There are at this time both a particular urgency and a special opportunity for reform of 
mathematics education. Since mathematics is the foundation of science and technology, reform 
is needed to prepare the more highly skilled work force that the nation now needs. Because of 
the emerging general agreement within the mathematics, mathematics education, and related 
professional communities on goals for mathematics education and means for achieving them, 
there is at this time a special opportunity for the nation to push ahead boldly in this area of 
education. (page 87)” 


The mathematics education community has indeed been pushing boldly ahead, and 
it is of great interest to note the character of the advances—especially as they 
contrast with the character of the field in its early days, approximately a quarter-century 
ago.* For example, Joe Crosswhite recalls that the first research sessions at 
an annual NCTM meeting were held behind the stage, behind a closed 
curtain—placed by conference organizers at a safe physical and psychological 
distance from more “teacherly” conference activities. Physically and intellectually, 
the research community stood apart. Indeed, its apartness was manifested in 
multiple ways: in focus, in methods, and in the communities from which it drew. As 
in all of the social sciences through the 1960’s and 1970’s, the methods employed 
tended to be “rigorous” and “scientific,” with a focus on experimental studies and 
statistical analyses. Many experiments took place in the lab, at some remove from 
instruction. Those studies which took place in classrooms tended to downplay the 
complexity of classroom interactions, focusing on specific instructional “variables” 
and their effects, as determined statistically. Hence in 1978 Kilpatrick felt obliged 
to suggest that educational researchers might have lost sight of meaningful mathe- 
matical behavior in their search for methodological rigor, and that the community 
might have much to learn from unrigorous but interesting studies such as the 


This report was prepared by Alan H. Schoenfeld, University of California at Berkeley, chair of the 
NCTM Research Advisory Committee, and was reviewed by members of the Committee. At the time 
this report was prepared in April 1993, committee members were Deborah Ball, Michigan State 
University; Robert Davis, Rutgers University; Beverly Ferrucci, Keene State College of New Hamp- 
shire; Marilyn Hala (Staff Liaison), NCTM Headquarters; Miriam Leiva (Board Liaison), University of 
North Carolina at Charlotte; Susan Jo Russell, TERC; William Tate, University of Wisconsin. 
Reprinted with permission from the JRME, copyright 1993, by the National Council of Teachers of 
Mathematics. 

* Papers in mathematics education can be traced back a good many years, of course, but the creation 
of the Journal for Research in Mathematics Education about 25 years ago is generally taken as a sign of 
the coalescence of the discipline. 
largely qualitative teaching experiments carried out in the Soviet Union by
researchers such as Krutetskii (1976). In terms of communication across
communities, Pólya was the exception that proved the rule: after the burst of energy that
produced the New Math, there was little interaction between the mathematics and 
the math-ed communities, especially along the lines of research. 

Things have changed! As noted in Everybody Counts, “real change requires
action by everyone involved in mathematics education” (page 93). The Mathemati- 
cal Sciences Education Board, formed in 1985, represents an attempt to bring 
together the various constituencies that have a stake in mathematics education. 
Multiple communities have a stake in getting things right. More importantly, 
multiple communities have major contributions to make. 

Mathematicians, for example, live and breathe the discipline; they can offer a 
deep sense of what it is to engage in mathematics, and a sense of what might be 
called the “mathematical validity” of a curriculum: whether the ideas and
processes with which students engage tend to reflect the deep underlying notions of
mathematical “doing.” In recent years the mathematical community’s interest in 
educational issues has mushroomed: witness the existence of Mathematicians and 
Educational Reform, a grass roots organization of university mathematicians with 
interest in contributing to K-12 mathematics education, and the fact that the 
American Mathematical Society has created a Committee on Education, one major 
function of which is to establish liaison with other, longer-established groups with 
educational interests. 

In many ways the teaching community has been galvanized by the Curriculum 
and Evaluation Standards for School Mathematics (NCTM, 1989) and the
Professional Standards for Teaching Mathematics (NCTM, 1991). The wisdom of the
profession was a major factor in the creation of those documents, and will be an
essential resource if we are to reach the goals set forth in them: teachers live the
reality of instruction in their classrooms, and must be the wellspring of the reform 
movement. And the professional teaching community is ready for interactions with 
the other communities, as evidenced by the spectacular growth in NCTM member- 
ship and attendance at annual NCTM meetings in recent years, and the diversifica- 
tion of conference programs to include a significant focus on research-related 
activities. 

Beyond the classroom, schools, school districts, parental understanding and 
influence, state departments of education and national curricular influences (texts 
and tests) are major factors that affect the ways in which reform can take place, 
and whether it will be sustained. Members of all these communities need to be 
enfranchised, and need to contribute to dialogue and change. 

Last but not least, the community of mathematics educators has grown
spectacularly over the past 25 years, and is capable of being a central “team player” in the
reform of the profession. Even a cursory glance at the Handbook of Research on 
Mathematics Teaching and Learning (Grouws, 1992) reveals how vibrant and robust 
an enterprise research in mathematics education has become. A closer look reveals 
how much the field has broadened, in the range of methods it employs and the 
phenomena it explores. Methods include computer simulations of individual cogni- 
tion, clinical interviews, classic laboratory studies, ethnographic analyses of class- 
room cultures, qualitative studies of teacher and student beliefs and their effects 
on behaviors, and more. The classroom, once seen by most as “too complex” for
careful studies of mathematical thinking and learning, is now seen by many as the 
natural place for such studies. Along with inward growth came an outward look: 
the mathematics education community now looks to teachers, mathematicians, 
psychologists, cognitive scientists, anthropologists, and numerous other
communities for issues, ideas, and inspiration as it seeks to grapple with the complex
phenomena of mathematical understanding, thinking, and learning. 

We are, then, at an important point in the development of mathematics 
education. There is general recognition that the problems we face are large, and 
that they require the concerted effort of all the major constituencies involved in 
the educational process. Although many of those constituencies have in the past 
been communities apart, there is now unprecedented potential for collaborative 
work and joint community building. Over the past few years, the Research 
Advisory Committee in particular and NCTM in general have been moving in 
those directions. Here are some examples of recent, proposed, and potential 
projects. 

Two years ago (July 1991) RAC reported on the NCTM Standards Research 
Catalyst conferences, which were then in progress. One major goal of the confer- 
ences, supported by the NSF and held in March and December 1991, was to focus 
research on major themes in the Curriculum and Evaluation Standards for School
Mathematics. The profession needed to know more about assessment, curriculum 
change, communication, policy, representational tools and models, and the chang- 
ing secondary curriculum; it made sense to have focus groups address those issues. 
But an equally important goal was the enfranchisement of a new research commu- 
nity, reaching out from the traditional base of mathematics educators to teachers, 
administrators, and others to begin research and research partnerships in these 
areas. By any measure, the effort was a significant success: a number of new 
researchers received NSF seed grants for work stimulated by the conference, and 
some of the partnerships formed (e.g. the communications group) continue today 
as active research collaboratives. 

We hope, a few years from now, to report on a similar undertaking related to 
the Professional Standards for Teaching Mathematics entitled the “Collaborative 
for enhancing research in mathematics teaching.” The goal of the proposed 
collaborative is to build a community of people working together to conceptualize 
and carry out research on mathematics teaching, with an emphasis on broadening 
the kinds of research being used to inform reform efforts and exploring new ways 
to communicate about research to diverse audiences. The collaborative is espe- 
cially interested in attracting new researchers, experienced researchers new to 
research on mathematics teaching, mathematicians, and mathematics educators 
whose activities have not traditionally been considered to be research (e.g. class- 
room teachers, staff developers, administrators, college teachers). 

With the help of the Exxon Education Foundation, work is now under way on 
the first phases of a project entitled “Recognizing and recording reform in 
mathematics education: Documenting the effects of the National Council of 
Teachers of Mathematics Curriculum and Evaluation Standards and Professional
Standards for Teaching Mathematics.” This project, quite large in scope, is in- 
tended to take a systemic view of change, and to help the community at large 
understand the dynamics of educational reform. This project will, of necessity, 
involve all the major constituencies involved in mathematics education. From the 
project description: “Such a project, through its structure and intent, emphasizes
that the changes outlined in the Standards documents will not happen quickly, or 
easily, or without experimentation and false starts. A project such as this confirms 
that it is not only acceptable, but essential, to learn from the process of implemen- 
tation and change and to disseminate and share that knowledge openly, even 
though the stories that emerge will describe obstacles and difficulties as well as
successes.” 

Finally, a set of activities on “Partnerships in research” is in the planning
stages. The task force working on the project expects to assemble videotapes of 
classroom instruction that can serve as the focal points for conversations among 
mathematicians, teachers, administrators, and mathematics education researchers 
regarding the values, goals, and practices of mathematics instruction. It hopes that 
first at a national conference, and then at a series of spin-off local conferences, the 
videotapes and related support materials will serve as means of facilitating
conversations among those groups, all of which are essential for continued progress in
educational reform. 

These are exciting times. The spirit of reform is in the air; the communities 
necessary to promote it are open to collaboration; and efforts to join forces in this 
important collaborative enterprise are being undertaken. That the various commu- 
nities listed above have grown to the point where they recognize their interdepen- 
dencies and are willing to build partnerships bodes well for all concerned, and 
should cheer us all—but it should not leave us feeling complacent. We have just 
embarked on the collaborative trail, and there is much more to be done. Although 
one can point to exceptions in individual states and locales, the research commu- 
nity has not, in general, been adequately engaged with policy makers at the state 
and national levels. Local, state, and national policies may or may not be consis- 
tent with our best understandings. Likewise, local, state, and national assessment 
measures may support or may undermine what we would like to have happen in 
our nation’s mathematics classrooms. Much more direct contact and productive 
interaction among the policy, assessment, and research communities is necessary. 
Similarly, although there are encouraging signs of interactions, the research 
community has yet to engage adequately with issues of teacher preparation. And,
of course, this brief list of necessary collaborations can be expanded without 
difficulty. In sum, let us take pleasure in the progress we have made. Then, let us 
return to the task of making and strengthening essential partnerships for progress 
in mathematics education. 
A Simple Proof of Pascal’s
Hexagon Theorem 


Jan van Yzeren 


Pascal’s Theorem. If the vertices of a hexagon lie on a circle and the three pairs of 
opposite sides intersect, then the three points of intersection are collinear. 
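The statement can be spot-checked with exact rational arithmetic. The following is only a sketch and is not part of the proof: the function names and the six sample parameters are chosen here for illustration. Points of the unit circle are generated from a rational parameter t, lines and intersection points are computed as cross products in homogeneous coordinates (so parallel sides cause no trouble), and the three intersection points are collinear exactly when the returned determinant vanishes.

```python
from fractions import Fraction

def circle_point(t):
    """Rational point of the unit circle, in homogeneous coordinates (x, y, 1)."""
    t = Fraction(t)
    return ((1 - t * t) / (1 + t * t), 2 * t / (1 + t * t), Fraction(1))

def cross(u, v):
    """Line through two points, or intersection point of two lines."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def pascal_determinant(params):
    """Determinant of the three intersection points of opposite sides."""
    A = [circle_point(t) for t in params]
    sides = [cross(A[i], A[(i + 1) % 6]) for i in range(6)]
    P = [cross(sides[i], sides[i + 3]) for i in range(3)]
    normal = cross(P[1], P[2])
    return sum(a * b for a, b in zip(P[0], normal))

# Six sample parameters (an arbitrary illustrative choice).
example = pascal_determinant([0, Fraction(1, 3), 2, -5, -1, Fraction(-1, 4)])
```

Because the arithmetic is exact, the determinant for the sample hexagon is exactly zero rather than merely small.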


This theorem was published in 1640 by sixteen-year-old Blaise Pascal. His 
original proof has been lost, and at times one wonders whether one or another of 
the known proofs is, in fact, Pascal’s original one. This also applies to the simple 
proof given here. 

Begin with the hexagon $A_i$, $i = 0, \ldots, 5$ of Figure 1, and consider the circle
through the points A,, A, and P,, where the first two points are (opposite) 
vertices, and the last is one of the “Pascal points” connected to them. This circle 
meets A,A, and A,A, at B, and B, respectively, and one uses arcs of the circles 
shown to find equal angles inscribed in them (or supplementary angles inscribed in 
opposite arcs). As a consequence, the triangles P,B,)B, and P,A,)A, have respec- 
tively parallel sides, that is, they are perspective from the point P,). Therefore, Po, 
P, and P, are collinear. 


Figure 1 


The proof also covers the case of A,A,||A3A, (ie., Py at infinity). Then, the 
triangles are translative, that is, P,P, is parallel with A,A, and A,A,. The only 
special case not covered by the proof concerns hexagons inscribed in a circle with 
parallels as opposite sides. This case, however, follows easily from appropriate 
arcs. 

Whether Pascal gave this proof is open to debate, but it seems that this proof 
has not turned up for 350 years. On this point Professor Coxeter kindly has 
commented as follows: “It is indeed remarkable that this elegant proof was not 
found in 350 years, and also somewhat remarkable that Guggenheimer came close
to it in 1967 and then felt obliged to introduce a peculiar lemma.” [3] 

Anyway, the historic delay justifies some special attention for the heuristics of 
this simple proof. 

The basic figure consists of two pencils of four lines joining points on a circle, 
viz. (Figure 2) Ay and A, with, respectively, A,, A,, A; and As. 


Figure 2 


Evidently, the two pencils are congruent (equal angles between corresponding 
lines). Therefore, if AA,A,Q is made similar to AA,A,A,, the segments A,Q 
and A,A, are divided proportionally and 4,A,//P,R//ST. Now, the crucial 
idea is to build up this basic figure in a converse manner, starting with two given 
similar triangles: AA, A,Q ~ AA,A,A, and forgetting the circle. 

Then, choose P, and R on, respectively, A,A, and QA,, such that P,R//A, Ao. 
Similarly S and T. Hereupon the following points are defined: A; = A,P, N AoR, 
A, =A,S NAoT, Py = AygS NAA, P, = Ay RN AAS. 

To prove that $P_0$, $P_1$ and $P_2$ are collinear:

Consider ARA,U, RU//P,A3, and its translative image AP,B,B,. Then, B, 
lies on Pj Ay as Py) A, //P,R, and B, lies on P,)A3, because P,B, = RU =A,A,° 
RT/A,T = A,A,:P,S/A,S. Therefore, the triangles P,B)B, and P,A,A, are 
perspective from the point P, and, indeed, P), P, and P, are collinear. 

Afterwards the crucial points B, and B, can be found directly. In fact, they lie 
on the circumcircle of AP,A,A,, because 2 P,B)A, = 2A;A) A, = ZA5A,A, 
= ZP,A,A, and 2A,B,B) = 2A,A 3A = LAA Ap = 2A,A,Bo. Actually, 
drawing the circumcircle of AP,A,A, is the very point of the new proof. 

Background of the heuristics is the fact that the metric of the Euclidean plane 
can be defined by giving a pair of similar triangles. After that, all other metric 
properties must follow by means of parallels and proportionalities (affine tools). 
The Mathematical Relationship Between
Kepler’s Laws and Newton’s Laws 


Andrew T. Hyman 


1. INTRODUCTION. Whenever a new scientific theory comes down the pike, it is 
greeted by skeptics who demand proof that the new theory is as good as the theory 
it would displace. That is why “the major scientific problem of the [seventeenth] 
century” was to prove that Isaac Newton’s law of gravity gives the same correct 
results as the older laws of Johannes Kepler [4]. This famous mathematical 
problem is solved below in an innovative way that requires no trigonometry, only 
elementary calculus, and none of the usual “clever tricks” [8].

Supposing that planets move according to Kepler’s Laws (which are reviewed in 
Section 2 below), then it follows that planetary acceleration is given by Newton’s 
central inverse-square equation (which is equation twelve below). This historic 
theorem was first proved by Newton, who thereby established his law of gravity as 
a respectable successor to Kepler’s Laws. This same theorem is proved in Section 
3, using simple and straightforward methods. The reverse theorem, according to 
which the central $1/R^2$ equation requires Keplerian orbits, is proved in
Section 4.

The two theorems proved here were first published in Newton’s 1687 
Philosophiae Naturalis Principia Mathematica, or Principia for short. Newton
admitted that the Principia is purposely “abstruse” ([3], p. 90), and a controversy persists
as to whether Newton’s proofs are entirely legitimate ([2], p. 30). Unlike the
Principia, the brief proofs below are quite transparent. 

Kepler’s Laws are differentiated in Section 3 using only Cartesian coordinates, 
and this novel Cartesian approach contrasts with the usual technique of transform- 
ing to polar coordinates. Although the converse proof of Section 4 is fundamen- 
tally the same as those of a few other authors ([5], p. 178 of [11], and 
p. 625 of [1]), each step in Section 4 follows naturally and inexorably from what 
precedes it. No rabbits are pulled out of hats. The method of Section 4 is thus 
presented in a clear manner which compares favorably to the more common 
methods of solving the same problem, and also to various uncommon methods 
which are discussed in [10]. 


2. REVIEW OF KEPLER’S LAWS. Kepler deduced his laws from data supplied 
by the astronomer Tycho Brahe. Kepler’s Laws are: 


I. Each planet moves along an ellipse with the Sun at a focus. 
II. The line from a planet to the Sun sweeps out equal areas in equal times. 
III. The square of a revolution’s duration, divided by the cube of the orbit’s 
greatest width, is the same for all planets. 




Kepler introduced the first two laws in his 1609 Astronomia Nova. The third or 
“harmonic” law was suggested in his 1619 Harmonice Mundi, and is often stated in 
terms of the length “a” of the semimajor axis (“a” is half the orbit’s greatest
width). The discovery of these laws marked the greatest advance since Aristarchus 
deduced nineteen centuries earlier that planets circle the Sun (see p. 2 of [6]). 

Ellipses are, of course, the closed curves formed by intersecting a cone and a 
plane. They were studied by the ancient Greeks (see p. 119 of [6]) who proved that 
the distance to a point (the “focus”) divided by the distance to a line (the 
“directrix”) is a constant “eccentricity” e. A beautiful proof of this focus-directrix
property was devised in 1822 by G. P. Dandelin. Dandelin’s proof appears at
p. 546 of [9], and it applies to both closed (0 < e < 1) and open (e ≥ 1) conic
sections. 


[Figure 1: the planet in the x-y plane, with the y-axis and the directrix marked.]


Kepler’s Laws can be translated into equations by picturing a planet as a 
point-particle in the x-y plane, having coordinates (X,Y) at time t (see Figure 1).
The Sun is located at the origin, and the planet’s directrix is perpendicular to the 
x-axis at a distance D/e from the Sun. “D” is called the “semi-latus-rectum” of 
the conic section. According to Kepler’s First Law, the distance $R = \sqrt{X^2 + Y^2}$
from the planet to the Sun is given by:

$$R = D - eX. \tag{1}$$

Kepler’s Second Law can be formulated in similarly simple terms. If the planet
crosses the y-axis at time $t_0$, then the area swept between $t_0$ and $t$ equals the area
under the curve minus the triangular area beneath the line from Sun to planet.
Hence, at all times,

$$\int_0^X Y\,dx - \frac{XY}{2} = C(t - t_0) \tag{2}$$

where “C” is the constant ratio of area swept to time elapsed (a new constant $t_0$
must be introduced whenever the planet crosses the x-axis).

The orbit’s total area divided by a revolution’s duration is clearly equal to C.
Also, it is not difficult to prove that the area of an ellipse is $\pi ab$ with
$a = D[1 - e^2]^{-1}$ and $b = D[1 - e^2]^{-1/2}$ (these two equations can be easily derived using
equation one). Therefore, Kepler’s Third Law is:

$$C^2/D = K \tag{3}$$
where the constant “K” is the same for all planets. In summary, Kepler’s Laws are
(1), (2), and (3). 
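
The passage from the Third Law as stated in words to equation (3) can be spelled out as follows. This is a sketch only; the symbol T for the duration of one revolution is notation assumed here and does not appear in the original.

```latex
% T (duration of one revolution) is notation assumed for this sketch.
\begin{align*}
C &= \frac{\pi a b}{T}, \qquad
b^2 = \frac{D^2}{1 - e^2} = aD, \\
\frac{C^2}{D} &= \frac{\pi^2 a^2 b^2}{T^2 D}
             = \frac{\pi^2 a^3}{T^2}
             = \frac{\pi^2}{8}\,\frac{(2a)^3}{T^2}.
\end{align*}
```

Since the Third Law says that $T^2/(2a)^3$ is the same for all planets, the ratio $C^2/D$ is the same for all planets as well, which is the content of (3).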


3. PROOF OF CENTRAL $1/R^2$ EQUATION. Kepler’s Laws will now be used to
find the acceleration of a planet. Differentiating (1) produces:

$$\frac{1}{R}\left[X\frac{dX}{dt} + Y\frac{dY}{dt}\right] = -e\frac{dX}{dt}. \tag{4}$$


Differentiating (2), using the Fundamental Theorem of Calculus, gives: 
$$Y\frac{dX}{dt} - X\frac{dY}{dt} = 2C. \tag{5}$$


A bit of algebra applied to (4), (5), and (1) makes it clear that the two velocity 
components are: 


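$$\frac{dX}{dt} = \frac{2C}{D}\,\frac{Y}{R} \tag{6}$$

and

$$\frac{dY}{dt} = -\frac{2C}{D}\,\frac{X}{R} - \frac{2Ce}{D}. \tag{7}$$

Differentiating (5) yields:

$$Y\frac{d^2X}{dt^2} - X\frac{d^2Y}{dt^2} = 0. \tag{8}$$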


Differentiation of the right-hand-side of (6) is facilitated by the following identity: 
$$\frac{d}{dt}\left[\frac{Y}{R}\right] = \frac{X}{R^3}\left[X\frac{dY}{dt} - Y\frac{dX}{dt}\right]. \tag{9}$$


This identity is based solely upon the definition of R.* By differentiating (6) and 
plugging in (9), (5) and (3) one gets: 


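$$\frac{d^2X}{dt^2} = \frac{-4KX}{R^3}. \tag{10}$$

By (8) and (10),

$$\frac{d^2Y}{dt^2} = \frac{-4KY}{R^3}. \tag{11}$$

Equations (10) and (11) can be written compactly in terms of vectors.

$$\frac{d^2\mathbf{R}}{dt^2} = \frac{-4K\mathbf{R}}{R^3}. \tag{12}$$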


Equation (12) is Newton’s central inverse-square equation. This equation expresses 
Newton’s law of gravity for the special case where planetary mass is negligible. 
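
The chain from (1), (6), and (7) to (12) can be spot-checked numerically. The following is a minimal sketch and not part of the original argument: the function names, the sample constants C, D, e, and the finite-difference step h are all assumptions chosen for illustration. It differentiates the velocity components (6) and (7) along the motion by a central difference and compares the result with the right-hand side of (12), taking K = C²/D from (3).

```python
import math

def velocity(x, y, C, D, e):
    """Velocity components from equations (6) and (7)."""
    r = math.hypot(x, y)
    vx = (2 * C / D) * (y / r)
    vy = -(2 * C / D) * (x / r) - 2 * C * e / D
    return vx, vy

def acceleration(x, y, C, D, e, h=1e-5):
    """Central-difference time derivative of the velocity along the motion."""
    vx, vy = velocity(x, y, C, D, e)
    fx, fy = velocity(x + h * vx, y + h * vy, C, D, e)
    bx, by = velocity(x - h * vx, y - h * vy, C, D, e)
    return (fx - bx) / (2 * h), (fy - by) / (2 * h)

def orbit_point(theta, D, e):
    """A point satisfying equation (1), R = D - eX, at polar angle theta."""
    r = D / (1 + e * math.cos(theta))
    return r * math.cos(theta), r * math.sin(theta)

def max_relative_error(C=0.7, D=1.3, e=0.4, samples=12):
    """Worst mismatch between the numerical acceleration and -4K R / R^3."""
    K = C * C / D                      # equation (3)
    worst = 0.0
    for i in range(samples):
        x, y = orbit_point(2 * math.pi * i / samples + 0.1, D, e)
        r = math.hypot(x, y)
        ax, ay = acceleration(x, y, C, D, e)
        ex, ey = -4 * K * x / r**3, -4 * K * y / r**3   # equation (12)
        worst = max(worst, math.hypot(ax - ex, ay - ey) / math.hypot(ex, ey))
    return worst
```

The comparison is made only at points of the orbit, because (6) and (7) were derived under the assumption that (1) holds.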


*Incidentally, note that $[X\dot{Y} - Y\dot{X}]$ is twice the areal speed (i.e., $R^2\dot{\theta}$ in polar coordinates), where
dots denote differentiation. The referee has keenly observed that therefore equation (9) is basically
$[\sin\theta]^{\cdot} = [\cos\theta]\,\dot{\theta}$.
4. RECOVERY OF KEPLER’S LAWS. It remains to be seen whether a bounded
orbit could satisfy (12) if it is not Keplerian. In other words, could a planet be 
accelerating according to (12), and yet violate Kepler’s Law? It will now be proved 
that such an orbit is impossible, by recovering Kepler’s Laws from (12). By the way, 
it is taken for granted that motion is confined to a plane, though this assumption is 
easily justified ([7], p. 105).

Equations (10) and (11) lead to (8), and integrating (8) retrieves (5) and (2). 
Plugging (5) into the crucial identity (9) gives: 


$$\frac{d}{dt}\left[\frac{Y}{R}\right] = \frac{-2CX}{R^3}. \tag{13}$$

On account of (13) and (10),

$$\frac{Y}{R} = \frac{C}{2K}\,\frac{dX}{dt} + A \tag{14}$$


where “A” is a constant of integration. 

The identity (9) has been very useful here, and it would have been necessary to 
pull this identity out of thin air were it not for the context provided by Section 3. 
In this context, the identity (9) has arisen in a natural way (whereas other authors 
have indeed pulled this identity from out of the blue). 

Interchanging “X” and “Y” in (9) produces another identity which together 
with (5) yields: 


$$\frac{d}{dt}\left[\frac{X}{R}\right] = \frac{2CY}{R^3}. \tag{15}$$

So, by (11),

$$\frac{X}{R} = \frac{-C}{2K}\,\frac{dY}{dt} + B \tag{16}$$


where “8B” is another constant. Plugging (14) and (16) into (5) yields: 


$$\frac{C^2}{K} + AY + BX = R. \tag{17}$$


If A = B = 0, this describes a circle. If not, (17) represents a conic section with
focus at the origin, eccentricity $[A^2 + B^2]^{1/2}$, and directrix given by:

$$\frac{C^2}{K} + Ay + Bx = 0. \tag{18}$$

This interpretation of (17) follows from a simple fact of analytic geometry: the 
distance from a point $(x_0, y_0)$ to a line $ax + by + c = 0$ is equal to
$|ax_0 + by_0 + c|\,[a^2 + b^2]^{-1/2}$. This well-known fact can also be applied to (18) in order to find
the distance from focus to directrix, and it is thus evident that the focus-directrix 
distance is as described by (3). Consequently, if Newton’s central inverse-square 
equation holds true then all bounded orbits must satisfy Kepler’s Laws, which was 
to be demonstrated. 
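
The recovery argument can also be illustrated numerically. The sketch below is an assumption-laden illustration and not the author's method: it integrates (12) with a generic fourth-order Runge-Kutta step, computes C, A, and B from the initial state via (5), (14), and (16), and then measures how far the computed path strays from the conic (17). The initial state, the value of K, the step size, and all function names are chosen here for illustration.

```python
import math

def deriv(state, K):
    """Right-hand side of (12), written as a first-order system."""
    x, y, vx, vy = state
    r3 = math.hypot(x, y) ** 3
    return (vx, vy, -4 * K * x / r3, -4 * K * y / r3)

def rk4_step(state, dt, K):
    """One classical Runge-Kutta step (a generic integrator, not from the paper)."""
    def nudge(k, f):
        return tuple(s + f * ki for s, ki in zip(state, k))
    k1 = deriv(state, K)
    k2 = deriv(nudge(k1, dt / 2), K)
    k3 = deriv(nudge(k2, dt / 2), K)
    k4 = deriv(nudge(k3, dt), K)
    return tuple(s + dt * (a + 2 * b + 2 * c + d) / 6
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def constants(state, K):
    """C from (5), A from (14), B from (16)."""
    x, y, vx, vy = state
    r = math.hypot(x, y)
    C = (y * vx - x * vy) / 2
    A = y / r - C * vx / (2 * K)
    B = x / r + C * vy / (2 * K)
    return C, A, B

def max_orbit_residual(K=1.0, state=(1.0, 0.0, 0.0, -1.6), dt=1e-3, steps=5000):
    """Largest violation of (17) along the path, using the initial C, A, B."""
    C, A, B = constants(state, K)
    worst = 0.0
    for _ in range(steps):
        state = rk4_step(state, dt, K)
        x, y = state[0], state[1]
        worst = max(worst, abs(C * C / K + A * y + B * x - math.hypot(x, y)))
    return worst

# Eccentricity [A^2 + B^2]^(1/2) of the sample orbit.
C0, A0, B0 = constants((1.0, 0.0, 0.0, -1.6), 1.0)
eccentricity = math.hypot(A0, B0)
```

For the sample initial state the constants are C = 0.8, A = 0, and B = 0.36, so the path should be an ellipse of eccentricity 0.36 with semi-latus-rectum C²/K = 0.64.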
A Short Proof of a Result on Polynomials


Razvan Gelca 


In this note we want to present a short proof of a result that appeared in [1]. For a
polynomial $f(x) = \prod_{i=1}^{n}(x - x_i)$, with distinct real roots $x_1 < x_2 < \cdots < x_n$, we
let $d = \delta(f) = \min_i (x_{i+1} - x_i)$ and $g(x) = f'(x)/f(x) = \sum_{i=1}^{n} 1/(x - x_i)$. If k is a
real number then the roots of the polynomial $f' - kf$ are also real and distinct.


Proposition. If for some j, $y_0$ and $y_1$ satisfy $y_0 < x_j < y_1 \le y_0 + d$ then $y_0$ and $y_1$
are not zeros of f and $g(y_0) < g(y_1)$.


Proof: The hypothesis implies that for all i, $y_1 - y_0 \le d \le x_{i+1} - x_i$. Hence for
$1 \le i \le j - 1$ we have $y_0 - x_i \ge y_1 - x_{i+1} > 0$ and so $1/(y_0 - x_i) \le 1/(y_1 - x_{i+1})$;
similarly for $j \le i \le n - 1$ we have $y_1 - x_{i+1} \le y_0 - x_i < 0$ and again
$1/(y_0 - x_i) \le 1/(y_1 - x_{i+1})$.

Finally $y_0 - x_n < 0 < y_1 - x_1$, so $1/(y_0 - x_n) < 0 < 1/(y_1 - x_1)$, and the
result follows by addition of these inequalities.


Corollary. $\delta(f' - kf) > \delta(f)$.

Proof: If $y_0$ and $y_1$ are zeros of $f' - kf$ with $y_0 < y_1$ then they are separated by a
zero of f and satisfy $g(y_0) = g(y_1) = k$. Hence from the proposition we cannot
have $y_1 \le y_0 + d$, so $y_1 - y_0 > d$ as required.
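
The corollary can be tried out on a concrete polynomial. The following is only an illustrative sketch with assumed names and sample data: it locates the zeros of f' − kf as the solutions of g(x) = k, using bisection on each interval where g is decreasing, and then compares the smallest gap between those zeros with d. The bracketing intervals outside the roots rely on the elementary bounds 1/|x − x_n| ≤ |g(x)| ≤ n/|x − x_n| to the right of the largest root (and the mirror image to the left of the smallest).

```python
def g(x, roots):
    """g(x) = f'(x)/f(x) = sum of 1/(x - x_i)."""
    return sum(1.0 / (x - r) for r in roots)

def bisect_decreasing(lo, hi, roots, k, iters=200):
    """Solve g(x) = k on [lo, hi], where g is decreasing and g(lo) >= k >= g(hi)."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if g(mid, roots) > k:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def zeros_of_derivative_minus_kf(roots, k, pad=1e-9):
    """Real zeros of f' - k f, located as solutions of g(x) = k."""
    roots = sorted(roots)
    n = len(roots)
    zeros = []
    if k < 0:   # one zero to the left of the smallest root
        zeros.append(bisect_decreasing(roots[0] + n / k, roots[0] + 1 / (2 * k), roots, k))
    for a, b in zip(roots, roots[1:]):   # one zero in each gap between roots
        zeros.append(bisect_decreasing(a + pad * (b - a), b - pad * (b - a), roots, k))
    if k > 0:   # one zero to the right of the largest root
        zeros.append(bisect_decreasing(roots[-1] + 1 / (2 * k), roots[-1] + n / k, roots, k))
    return zeros

def min_gap(points):
    """Smallest distance between consecutive entries of a sorted list."""
    return min(b - a for a, b in zip(points, points[1:]))

sample_roots = [0.0, 1.0, 2.5, 4.5]      # illustrative choice, d = 1
d = min_gap(sample_roots)
gaps = {k: min_gap(zeros_of_derivative_minus_kf(sample_roots, k))
        for k in (-2.0, -0.3, 0.0, 0.5, 3.0)}
```

For each sample value of k the smallest gap between zeros of f' − kf comes out larger than d, as the corollary predicts.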
Two Amusing Dynkin Diagram
Graph Classifications 


Robert A. Proctor 


Here are a couple of simply stated graph classifications which can be used to 
amuse and amaze students and friends during tea or cocktail parties. It’s fun to 
watch non-mathematicians theologically wrestle with the following notion: Mathe- 
maticians can prove that no one can come up with any solutions beyond the ones 
shown in the figures. Many people have been aware of the first classification for 
some time. The second one is an immediate consequence of a well known fact, but 
perhaps has not been formulated in this way before. 

A simple graph is a graph which has no loops or multiple edges. I'll call it 
labelled if a positive real number has been assigned to each vertex. 


Problem 1. Find all connected labelled simple graphs whose labels satisfy the 
following condition: Twice any label is equal to the sum of the labels of the adjacent 
vertices. 


Answer. If you check this condition nine times, you can verify that the labels of the
last graph in FIGURE 1 satisfy this requirement. For example, at the central vertex
we have: 2 × 6 = 4 + 5 + 3. Surprisingly, up to an overall scalar multiple of the
labels, all possible connected graphs labelled in this way are shown in FIGURE 1!
There are two infinite families of solutions and then three specific peculiar 
“exceptional” solutions. 


Problem 2. Find all connected labelled simple graphs whose labels satisfy the 
following condition: Twice any label minus two is equal to the sum of the labels of the 
adjacent vertices. 


Answer. The only possibilities are shown in FIGURE 2. At the central vertex of the
last graph we have 2 × 270 − 2 = 182 + 220 + 136. Again there are two infinite
families of solutions followed by three exceptional solutions.
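
Both conditions are easy to check mechanically. The sketch below is an illustration with assumed names and data: since the figures are not reproduced here, the nine-vertex labelling (a central vertex labelled 6 with arms 5, 4, 3, 2, 1 and 4, 2 and 3) and the path labels k(n − k + 1) are reconstructions, consistent with the central-vertex equation quoted above but not copied from the original drawings.

```python
def satisfies(edges, labels, offset):
    """True if 2*label - offset equals the sum of adjacent labels at every vertex."""
    neighbour_sum = {v: 0 for v in labels}
    for u, v in edges:
        neighbour_sum[u] += labels[v]
        neighbour_sum[v] += labels[u]
    return all(2 * labels[v] - offset == neighbour_sum[v] for v in labels)

# Problem 1 (offset 0): assumed nine-vertex graph, a chain 0..7 plus vertex 8
# attached to vertex 5, which is the central vertex labelled 6.
NINE_EDGES = [(i, i + 1) for i in range(7)] + [(5, 8)]
NINE_LABELS = dict(enumerate([1, 2, 3, 4, 5, 6, 4, 2, 3]))

def path_check(n):
    """Problem 2 (offset 2) on a path with n vertices and labels k(n - k + 1)."""
    edges = [(k, k + 1) for k in range(1, n)]
    labels = {k: k * (n - k + 1) for k in range(1, n + 1)}
    return satisfies(edges, labels, 2)

problem1_ok = satisfies(NINE_EDGES, NINE_LABELS, 0)
problem2_ok = all(path_check(n) for n in range(1, 12))
```

Scaling every label of the nine-vertex graph by a common factor keeps Problem 1 satisfied, which is the "overall scalar multiple" mentioned in the answer; the same scaling breaks Problem 2.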
Figure 1


I’ve used Problem 1 to intrigue friends and students for years, but only recently 
did I notice Problem 2. I like it better than Problem 1 because the labels are much 
more entertaining, and because it’s easier to explain the significance of its graphs 
to beginning graduate students: Without the labels, the graphs shown in FIGURE 1
are the extended Dynkin diagrams of types ADE, whereas the graphs of FIGURE 2
without their labels are just the ordinary Dynkin diagrams of types ADE. These 
play a role in the classification of simple Lie algebras (or groups), whereas the 
extended diagrams are used to help classify a more sophisticated family of objects, 
the affine Lie algebras. (Also, the solutions to Problem 2 are unique immediately, 
without the “overall scalar multiple” fine print needed with the solutions to 
Problem 1.) 

There are many kinds of algebraic and geometric structures arising in mathe- 
matics which are “classified” by a list of some kind of Dynkin diagrams. For 
example, one could ask what are the possible finite subgroups of the orthogonal 
groups O(n,R) which are generated by reflections. If we ignore the dihedral 
groups and require that the subgroup fix only the origin, then there is exactly one 
such subgroup for each member of the following list: $A_n$ (n ≥ 1), $B_n$ (n ≥ 2), $D_n$
[Figure with labelled graphs; vertex labels recovered from the drawing: in the first family (a path on n vertices) the k-th vertex is labelled k(n − k + 1); in the second family the k-th vertex along the path is labelled k(2n − k − 1) for 1 ≤ k ≤ n − 2, and the two remaining vertices are each labelled (n − 1)n/2.]
(n ≥ 4), $E_6$, $E_7$, $E_8$, $F_4$, $G_2$, $I_3$, and $I_4$. Here each $X_n$ denotes a particular “Coxeter
diagram” which has n nodes and which describes a particular subgroup of O(n, R)
up to conjugacy according to a certain recipe. These diagrams (shown on page 57 
of [BG]) look similar to those appearing in our figures, but the edges are labelled 
and the vertices are unlabelled. For the details of this classification consult [BG], 
which was written for undergraduates. The names of our diagrams in FIGURE 1 are 
$A_n^{(1)}$, $D_n^{(1)}$, $E_6^{(1)}$, $E_7^{(1)}$, and $E_8^{(1)}$ and in FIGURE 2 are $A_n$, $D_n$, $E_6$, $E_7$, and $E_8$. Although
some aspects of the diagrams and the membership of the list can vary, it is usually 
readily apparent when a classification by Dynkin diagram-like objects is occurring. 
The diagrams of type E look quite distinctive, and it seems that one always has at 
least two diagrams of this type arising. The set of possible simple Lie algebras over 
C is indexed by the Dynkin diagrams $A_n$ (n ≥ 1), $B_n$ (n ≥ 2), $C_n$ (n ≥ 3), $D_n$
(n ≥ 4), $E_6$, $E_7$, $E_8$, $F_4$, and $G_2$. These diagrams (shown on page 58 of [Hm1]) are
exactly what is meant by “Dynkin diagram.” They are similar to Coxeter diagrams, 
except that some of the edges are directed. Usually the structures of the objects
indexed by the version of Dynkin diagram at hand of type A, type D, or type E are 
easier to deal with than the structures of the objects indexed by the diagrams of 
other types. This fact is reflected at the diagram level by the diagrams of types 
ADE having all “easy” edges. The overall phenomenon of classification by
Dynkin-like diagrams has many fascinating and mysterious aspects; consult [HHSV] 
for a survey. 
Although the answers to Problems 1 and 2 are stated in terms of graphs, they
are actually theorems in linear algebra! To see this, do the following: Let the 
variables $x_1, x_2, \ldots$ denote the as yet unknown vertex labels. In either problem,
associate to each labelled graph with n vertices an n × n system of linear
equations. For example, the first two requirements of Problem 2 corresponding to
the first two vertices of the first graph in FIGURE 2 give rise to the equations
$2x_1 - x_2 = 2$ and $-x_1 + 2x_2 - x_3 = 2$. For the graph of this form with 4 vertices,
the system of equations giving all of the requirements for that graph is: 


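$$\begin{pmatrix} 2 & -1 & 0 & 0 \\ -1 & 2 & -1 & 0 \\ 0 & -1 & 2 & -1 \\ 0 & 0 & -1 & 2 \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{pmatrix} =
\begin{pmatrix} 2 \\ 2 \\ 2 \\ 2 \end{pmatrix}$$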
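
Such a system can be solved exactly by elimination over the rationals. The following sketch is an illustration only: the helper names are invented here, and the eight-vertex graph is an assumed reconstruction of the last graph of FIGURE 2 (a chain of seven vertices with one extra vertex attached to the third), chosen to agree with the central-vertex equation 2 × 270 − 2 = 182 + 220 + 136 quoted earlier.

```python
from fractions import Fraction

def graph_matrix(n, edges):
    """Matrix with 2 on the diagonal and -1 for each pair of adjacent vertices."""
    A = [[Fraction(2) if i == j else Fraction(0) for j in range(n)] for i in range(n)]
    for i, j in edges:
        A[i][j] = A[j][i] = Fraction(-1)
    return A

def solve(A, b):
    """Gauss-Jordan elimination over the rationals; assumes det A != 0."""
    n = len(A)
    M = [row[:] + [Fraction(bi)] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]

# The 4-vertex path from the text, with right-hand side (2, 2, 2, 2).
path_labels = solve(graph_matrix(4, [(0, 1), (1, 2), (2, 3)]), [2] * 4)

# Assumed eight-vertex graph: chain 0..6 with vertex 7 attached to vertex 2.
eight_edges = [(i, i + 1) for i in range(6)] + [(2, 7)]
eight_labels = solve(graph_matrix(8, eight_edges), [2] * 8)
```

For the 4-vertex path the solution is (4, 6, 6, 4), which matches the pattern k(n − k + 1) with n = 4.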


In general, associate to any graph arising in either problem a matrix $A = (a_{ij})$ of
the following form: The main diagonal entries $a_{ii} = +2$; the off-diagonal entries
$a_{ij}$ are −1 if vertex i is connected to vertex j and 0 otherwise. Conversely, given
any n × n matrix A with $a_{ii} = +2$ and $a_{ij} = a_{ji} = -1$ for some pairs (i, j) and
$a_{ij} = 0$ otherwise, one could depict it with a simple graph wherein vertex i is
connected to vertex j whenever $a_{ij} = -1$. We say that such a matrix A is
connected if its corresponding graph is connected. Define three column vectors of
length n as follows: $v = (x_1, x_2, \ldots, x_n)^t$, $\mathbf{0} = (0, 0, \ldots, 0)^t$, and $\mathbf{2} = (2, 2, \ldots, 2)^t$.
Let’s say that v is positive if $x_i > 0$ for 1 ≤ i ≤ n.

So Problem 1 (respectively Problem 2) actually is asking us to find all connected
matrices A of this form for which the linear system Av = 0 (respectively Av = 2) 
has a positive solution. With these formulations the answers stated at the begin- 
ning of this note are mostly derived on pages 47—54 of [Kac]. These eight pages 
can be read and understood by themselves, provided that you have had a good 
course in linear algebra. Kac is interested in such questions because the matrices 
A, known as generalized Cartan matrices in Lie theory, describe the structure of 
certain kinds of Lie algebras. On these pages he uses basic linear algebra 
techniques to investigate the existence of positive solutions v to systems of linear 
inequalities such as $Av > 0$ and $Av = 0$, assuming that $A$ has a certain form. The 
notions of positive definiteness and positive semi-definiteness play a key role. 

Here are the details. Part (ce) of Proposition 4.7 and parts (b) and (c) of 
Theorem 4.8 of [Kac] give the answer to Problem 1. For Problem 2 start with 
Theorem 4.3. By this result, since A cannot be of type (Aff) or (Ind), it must be of 
type (Fin). Then part (a) of Theorem 4.8 tells us that the graph S must be one of 
the graphs listed in Figure 2. My contribution is to supply the particular right 
hand side $(2, 2, \ldots, 2)^T$, thereby forming Problem 2 as stated above. It is easy to 
check that each of the labellings given in Figure 2 meets the requirements. In each 
case only one such labelling is possible, since Theorem 4.3 of [Kac] tells us that 
$\det A \ne 0$ whenever $A$ is of type (Fin). 

The operator that doubles a vertex label and then subtracts the adjacent labels 
may be thought of as a discrete version of $-\Delta$, where $\Delta$ is the Laplace operator. 

Now a few comments for people who are familiar with simple Lie algebras. The 
matrices $A$ associated to each of the graphs of Figure 2 are just the Cartan 
matrices for the root system [Hm1] associated to the graph. Multiplying a column 
vector from the left by $A$ has the following interpretation: You are just converting 
a column vector of coordinates with respect to the simple root basis to the 
fundamental weight basis. Since the coordinates of the famous vector $\rho = \delta$ are 
$(1, 1, \ldots, 1)^T$ with respect to the fundamental weight basis, the labels appearing on 
the vertices are just the coordinates of the vector $2\rho$ in the simple root basis. As 
such, they appear in tables such as those at the end of [Bou]. The extended Dynkin 
diagrams of Figure 1 can be understood in the context of ordinary root systems as 
follows. If you adjoin a vertex to the Dynkin diagram of a root system which 
represents the lowest root $-\beta$ as described on page 95 of [Hm2], then in the ADE 
cases the diagrams of Figure 1 will result. The labels on the remaining vertices 
give the expansion of $\beta$ with respect to those simple roots. 

Why was I thinking about this recently? In 1980 the labels of Figure 2 arose in 
my thesis (which was written under the direction of Richard Stanley). A member of 
my committee, George Lusztig, asked me if I knew of an existing interpretation of 
these mysterious positive integers. Last year while flipping through the recent 
[MPR], the numbers jumped out at me twelve years late: The typography of the 
tables in [Bou] was such that I hadn’t noticed them before. Fortunately, I was 
passed on my defense nonetheless! (The paper version of that chapter of my thesis 
[Pro] describes a Dynkin diagram classification of order diagrams of finite partially 
ordered sets. That result has a very similar flavor to the subject matter of this 
note.) 
Professor Wedderburn’s request that the Association be represented on 
the Editorial Staff of the Annals of Mathematics by two associate editors was 
favorably considered. The Trustees authorized President Ford to appoint a 
committee of three, including himself, with power to select and nominate two 
associate editors of the Annals of Mathematics. President Ford appointed 
Professors Cairns and Slaught as the other members of this committee. It was 
understood that the Annals volume will be still further enlarged and it was 
felt that our subvention to the Annals is now inadequate. The Trustees, 
therefore, voted to increase the annual subvention to $300. 

American Mathematical Monthly 

34, (1927) p. 117 
Postcards from Max 


As remembered by Paul Halmos 


Editor’s note: Max Zorn died on March 9, 1993. 


Max Zorn was born twenty-nine years before Zorn’s Lemma, and Zorn’s 
Lemma, the technique and the attitude, will go on living for centuries. For Max the 
lemma was a remark—the title of his paper on the subject is “A remark on 
method in transfinite algebra”; it was John Tukey who baptized the result. 


Max was a friend of mine, a good 
friend. We became acquainted in 1969, 
when I gave a colloquium talk at 
Bloomington. Max came to the tea 
before the talk—he came to tea every 
day, whether there was a colloquium or 
not—and, in accordance with his cus- 
tom, he came prepared. On a wrinkled 
slip of paper (it might actually have 
been the back of a used envelope) he 
had scribbled the questions he wanted 
me to answer—what is my opinion on 
the work of so-and-so?, how is this work 
connected with something I wrote ten 
years before?, has there been any 
recent progress along the lines of such- 
and-such? I don’t remember any colloquium 
at which he didn’t ask a question afterward (and sometimes during)—a 
relevant question, a pertinent question, a sharp question. His questions showed 
that he understood the subject, understood the talk, and was ready to understand 
and remember the answers. His questions were not intended to be embarrassing, 
but if the speaker was not thoroughly checked out on all aspects of the subject of 
his own talk, they could become embarrassing. Max didn’t mean to cause pain, and 
he cheerfully indicated a friendly acceptance of even a vague answer. 

Does everybody remember the Piccayune Sentinel? Yes, I spelled it right—the 
misspelling is Max’s own and I faithfully copied both c’s. I don’t know just when he 
started it; the first issue that I have a copy of is dated November 1950. It was a 
one-sheet affair that Max called the world’s smallest newspaper and that he gave 
to a few friends (usually by putting copies into his colleagues’ mailboxes, and 
rarely, for distant friends, by mailing them). One issue I have is labelled “partially 
late”. The contents of the Piccayune Sentinel were of the same kind as Max 
himself and his postcards (and as unpredictable and as confusion-inducing)—just 
longer and more widely distributed. 
We were colleagues at Indiana for many years, and we had a routine: most 
afternoons we would troop over to the physicists’ common room in Swain West 
(the mathematicians couldn’t afford such a large and elegant place), get our coffee 
and cookies, and sit gossiping on the couch by the permanently curtained windows 
(heaven forbid that some unwanted light or air should enter). Our gossip was never 
malicious (well, hardly ever): it was about people in the profession (who is moving 
where and how much will he get paid?), about the profession (could square-sum- 
mable power series really be relevant to the Riemann hypothesis?), about local 
matters (who will teach what when and will that room be big enough to hold the 
class?)—and about books, about movies, about travel, about languages, about 
anything that had a momentary or a permanent interest for at least one of us. We 
never ran out of subjects; I looked forward to our meetings, and when some 
catastrophe prevented one, I missed it. 

One conversation we had bothered me afterward, and I was moved to write 
down my concern in a letter to Max. The letter didn’t go through the U.S. Mail—I 
just put it into Max’s box. Here is what I wrote. 

“Something you said yesterday worries me—I kept thinking about it during the 
night and it kept worrying me. You said that you had bad judgement and that you 
were a failure—two statements with which I thoroughly disagree. 

“Of all people I know you are the one who has the sharpest, finest, clearest 
insight into all of mathematics, and, for that matter, into most of human life. Your 
tastes and mine (in mathematics, and sometimes in other things) are not always the 
same—but your insight and your judgement are impressive. 

“As for failure: that’s nonsense. You are respected by everyone who knows you 
(and by thousands of others), and you are liked by everyone who knows you or ever 
came anywhere near you. You are a mathematician. Most people no longer know 
whether your work was algebra, or complex functions, or something funny about 
semigroups, or whatever—but they respect you for your reputation, for (if you'll 
pardon the expression) your lemma, for your questions, for your wit (a non- 
accidental cognate of Wissenschaft), for your understanding. You have written, you 
have taught, you have inspired—is that a failure? I wish I were one!” 

Max answered me with a hand-written note that I found in my box the next day. 

“In school I heard: 

Eigenlob stinkt, 
Freundeslob hinkt, 
Feindeslob klingt. 
(But) thanks.” 
I wasn’t quite sure of all the verbs, so I checked them in a dictionary; roughly (not 
too roughly) they mean stinks, limps, and rings (respectively). 

I left Indiana twice—meaning that I accepted an invitation from another 
university, moved, returned after a couple of years, and some years later moved 
again—and we started corresponding. The first time, when at tea one day I said 
“Max, I’ll be leaving”, he said “For bad?”. That, by the way, is typical of his use of 
language—he knew idiomatic English perfectly, and had enough control over it 
that he could twist it to communicate delicate shades of meaning elegantly and 
efficiently. 

He was an unpredictable correspondent. I am a garrulous one—I tend to write 
repetitive letters full of many details that probably no one besides me is interested 
in—and he varied from stories with smiles in them to almost brusquely short 
hello-good-byes. As the years went on, I got in the habit of writing him a longer 
letter (three or four single-spaced pages) approximately once a month, and he got 
in the habit of a short and mysterious friendly postcard approximately twice a year. 
The operative word is mysterious: he used abbreviations that he invented as he was 
writing, and with them he referred to happenings, past and future, that I had no 
way of knowing anything about. Every now and then I really wanted to understand 
the latest mystery and I demanded an explanation (in my next letter, or even by 
telephone)—and he was always good-natured about it, and while seemingly puzzled 
that someone could fail to understand something that clear, he cheerfully ex- 
plained. The result was sometimes understandable. 

The first Zorn letter that I saved was one of the long friendly kind, two 
hand-written pages, and it is signed: “as never before, Max”. A few years later 
another letter ends with: “as before, Max”. One postcard consisted of the follow- 
ing sentence: “If f(x, y) is such that f(1, y), f(2, y), ..., f(m, y), ... are com- 
putable, then I want f(x, y) to be computable”. A couple of years later (again a 
postcard): “A I (Nach Kant ist die Existenz des eigenen Ich nicht trivial.)” Still 
another: “Is the symbol of the symbol (defined and) the same as the symbol?” 
Again: “Is a random variable a function or an equivalence class of functions?” 
And: “Sum, ergo dubito.” One letter I received from Max was one typewritten 
page, on the back of which appeared a backward carbon copy of the same letter, 
and a handwritten footnote: “You can see that I tried to keep a copy. Long live 
Freud!” 

The letters and postcards came oftener at the beginning than later on—perhaps 
six to eight times a year—toward the end I was lucky if I got two a year. His last 
letter came in December 1992; it ends with “I plead fatigue, Max.” 

I miss Max. 
UNSOLVED PROBLEMS 
Edited by: Richard Guy 


In this department the MONTHLY presents easily stated unsolved problems dealing 
with notions ordinarily encountered in undergraduate mathematics. Each problem 
should be accompanied by relevant references (if any are known to the author) and by 
a brief description of known partial or related results. Typescripts should be sent to 


Richard Guy, Department of Mathematics & Statistics, The University of Calgary, 
Alberta, Canada T2N 1N4. 


A Quarter Century of Monthly Unsolved 
Problems, 1969-1993 


Richard K. Guy 


A most valuable and timely contribution by Stanley Rabinowitz, which will hope- 
fully greatly reduce the amount of duplication and rediscovery that presently 
occurs, and will be a boon to all editors of problems sections of all kinds, is his 
series 


Index to Mathematical Problems 


of which Volume 1, 1980-1984, is available. It is obtainable from MathPro Press, 
Westford MA. 

References in brackets are to year and page numbers of this MONTHLY, while 
dates in parentheses refer to publications listed at the end, and other items are 
labelled (tbp) if they are likely to be published formally, or as written communica- 
tions (wrc) if publication plans are not presently known. Dates and pages in 
brackets are also appended to items in the bibliography indicating where the 
problem originally appeared in the MONTHLY. 

In [1969, 54] Victor Klee launched this section of the MONTHLY with the 
notorious equichordal problem, which goes back to World War I. It gets a mention 
in reviews by both DeTurck (1993) and Falconer (1993), but even as they wrote the 
problem was finally being solved by Marek Rychlik (tbp). 

The graceful graph [1969, 1128] bibliography is now best regarded as the 
purview of Joseph Gallian, to whom items should be sent. My somewhat out-of-date 
version contains 232 papers by 410 authors, only 169 distinct. 

Klee [1970, 63] asked for the maximum length of a d-dimensional snake, where 
by snake is meant a simple circuit in the d-cube which has no chords. If we denote 
this maximum length (number of edges) by $s(d)$, then Abbott and Katchalski 
(1991) show that $s(d) \ge 77 \times 2^{d-8}$. Their paper contains a very good bibliography. 

Erdős and Guy [1973, 52] raised several questions concerning the crossing 
numbers of graphs; Sýkora and Vrťo (1992) give the following lower bounds for the 
crossing number of the complete bipartite graph, the edge-skeleton of the n- 
dimensional cube, and the complete graph, on an orientable surface of genus g. 


\[ \nu_g(Q_n) \ge \frac{4^n}{1500g} - n^2 2^{n-1}, \]

\[ \nu_g(K_{m,n}) \ge \frac{m^2 n^2}{7000g} - \frac{mn(m+n)}{2}, \]

\[ \nu_g(K_n) \ge \frac{n^4}{6075g} - \frac{n^3}{2}. \]


Steven Finch extended Queneau’s computations of “Ulam sequences” [1973, 
919; 1975, 998; 1987, 962], a $(u, v)$-sequence of positive integers $\{a_n\}$ being defined 
by $a_1 = u$, $a_2 = v$ and, for $n > 2$, $a_n$ is the least integer expressible uniquely as the 
sum of two distinct earlier members. Queneau showed that the (2, 5)-, (2, 7)- and 
(2,9)-sequences are regular in the sense that their differences are ultimately 
periodic. Finch (1991, 1992) proved that if the (u,v)-sequence has only finitely 
many even terms, then it is regular. Schmerl and Spiegel (tbp) prove that the 
(2, v)-sequence has just two even terms for any odd v > 3. 
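
The definition of a $(u, v)$-sequence is easy to experiment with. The sketch below is illustrative only: the function name `ulam_sequence` is ours, it assumes $u < v$, and it uses the usual convention (not spelled out above) that each new term must exceed the previous one.

```python
from collections import Counter

def ulam_sequence(u, v, count):
    """First `count` terms of the (u, v)-sequence, assuming u < v.

    Each new term is the least integer greater than the previous term that
    is a sum of two distinct earlier terms in exactly one way.
    """
    seq = [u, v]
    # ways[s] = number of pairs of distinct earlier terms summing to s
    ways = Counter({u + v: 1})
    while len(seq) < count:
        candidate = seq[-1] + 1
        while ways[candidate] != 1:
            candidate += 1
        for earlier in seq:
            ways[earlier + candidate] += 1
        seq.append(candidate)
    return seq[:count]

print(ulam_sequence(1, 2, 10))
terms = ulam_sequence(2, 5, 60)
print([t for t in terms if t % 2 == 0])  # even terms among the first 60
```

For the (2, 5)-sequence the even terms printed are the two predicted by the Schmerl and Spiegel result quoted above.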

Leech [1975, 923] asked, for each integer n, what is the greatest integer N such 
that there exists a tree with n nodes, and edges labelled with integers, in which the 
distances between pairs of nodes include the consecutive values 1, 2, ..., N? Here 
the distance is the sum of the labels on the unique path joining the nodes. Work of 
Gibbs and Slater (1991), Herbert Taylor (1991) and Yang Yuan-Sheng (wrc) has 
improved the results for paths and for more general trees to 


n       2   3   4   5   6   7   8   9   10   11    12 
path    1   3   6   9  13  18  24  29   37   45   (51) 
trees   1   3   6   9  15  20  26  34   41  (48)  (55) 


where the entries in parentheses are not necessarily best possible. 
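
To make Leech's question concrete, here is a small sketch that computes, for a given edge-labelled tree, the largest N such that every value 1, 2, ..., N occurs as a distance. The function names and the edge-list encoding `(u, v, label)` are assumptions for illustration; the two example labellings are small hand-checked cases, not the record-holders in the table.

```python
def tree_distances(n, weighted_edges):
    """Set of all pairwise distances in a tree on vertices 0..n-1."""
    adjacency = {v: [] for v in range(n)}
    for u, v, w in weighted_edges:
        adjacency[u].append((v, w))
        adjacency[v].append((u, w))
    distances = set()
    for source in range(n):
        dist = {source: 0}
        stack = [source]
        while stack:  # depth-first walk; paths in a tree are unique
            u = stack.pop()
            for v, w in adjacency[u]:
                if v not in dist:
                    dist[v] = dist[u] + w
                    stack.append(v)
        distances.update(d for node, d in dist.items() if node != source)
    return distances

def leech_value(n, weighted_edges):
    """Largest N with 1, 2, ..., N all occurring as distances."""
    distances = tree_distances(n, weighted_edges)
    N = 0
    while N + 1 in distances:
        N += 1
    return N

path4 = [(0, 1, 1), (1, 2, 3), (2, 3, 2)]   # path with labels 1, 3, 2
star4 = [(0, 1, 1), (0, 2, 2), (0, 3, 4)]   # star with labels 1, 2, 4
print(leech_value(4, path4), leech_value(4, star4))
```

Both examples reach N = 6, matching the n = 4 column of the table.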

Joseph Gerver (tbp) and evidently Ben Logan before him in 1976, probably 
found the maximum area sofa that you can move round a corner [1976, 188 and see 
1977, 811 and 1991, 974]. A partial description of it is given by Ian Stewart (1992). 
Its boundary comprises three straight line segments, four arcs of radius 1/2, seven 
arcs of involutes of a circle, and four arcs of involutes of involutes of a circle. Its 
area is ≈ 2.2195. 

In [1983, 35] I warned readers not to try to solve various problems, one of which 
was the notorious 3x + 1 problem. Two years later [1985, 3] Lagarias gave a 
valuable survey and bibliography. Recently he and Weiss (1992) have given two 
interesting stochastic models for the problem which independently produce the 
same constant $\gamma \approx 41.677647$ for $\limsup_{n \to \infty} (\sigma_\infty(n)/\ln n)$, where $\sigma_\infty(n)$ is the 
number of iterations of the famous function $T(n) = n/2$ ($n$ even), $T(n) = (3n + 1)/2$ 
($n$ odd) required to get to the value 1. 
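
The quantity $\sigma_\infty(n)$ is simple to compute directly. The following sketch (function names are ours) iterates $T$ until the value 1 is reached; it assumes the iteration terminates for the inputs tried, which is exactly the unproved content of the $3x + 1$ problem.

```python
from math import log

def T(n):
    """The 3x+1 function: n/2 for even n, (3n+1)/2 for odd n."""
    return n // 2 if n % 2 == 0 else (3 * n + 1) // 2

def sigma_inf(n):
    """Number of iterations of T needed to reach 1 (assumes it is reached)."""
    steps = 0
    while n != 1:
        n = T(n)
        steps += 1
    return steps

for n in (3, 27, 97):
    print(n, sigma_inf(n), sigma_inf(n) / log(n))
```

The printed ratios for small n are far below the limsup constant quoted above, which concerns extreme behaviour as n grows.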

In [1991, 974-975] we compared the problem of Forcade, Lamoreaux and 
Pollington [1986, 119; 1989, 905] with the special case asked by Basil Gordon. The 
papers of Chandler (1988) and of Forcade and Pollington (1990) are relevant. Blair 
Kelly III has done a computer search, revealing that n = 85 is the smallest 
counterexample. The next counterexamples are for 92 ≤ n ≤ 108, n = 112, n = 
113, 115 ≤ n ≤ 118 and 121 ≤ n ≤ 156. He says that it is natural to conjecture 
that there are no Gordon maps for n > 120. 

Tomaszewski [1986, 280] considered $n$ real numbers $a_1, \ldots, a_n$ satisfying 
$\sum a_i^2 = 1$ and asked if, of the $2^n$ sums of the form $\sum \pm a_i$, it is possible that there 
are more with $|\sum \pm a_i| > 1$ than there are with $|\sum \pm a_i| \le 1$. Holzman and 
Kleitman (tbp) establish the sharp lower bound 3/8 for the case where the 
inequality is strict, $|\sum \pm a_i| < 1$, but for the original problem the gap between 3/8 
and the conjectured 1/2 is still open. 
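
For small $n$ the sign sums can simply be enumerated. The sketch below is an illustration under our own naming (`sign_sum_counts`); it uses exact rational arithmetic so that the boundary case $|\sum \pm a_i| = 1$ is classified reliably. The example vector $(1/2, 1/2, 1/2, 1/2)$ is our choice; it has $\sum a_i^2 = 1$.

```python
from fractions import Fraction
from itertools import product

def sign_sum_counts(a):
    """Count the 2^n signed sums by whether |sum| is < 1, == 1, or > 1."""
    less = equal = greater = 0
    for signs in product((1, -1), repeat=len(a)):
        total = abs(sum(s * x for s, x in zip(signs, a)))
        if total < 1:
            less += 1
        elif total == 1:
            equal += 1
        else:
            greater += 1
    return less, equal, greater

a = [Fraction(1, 2)] * 4
assert sum(x * x for x in a) == 1
less, equal, greater = sign_sum_counts(a)
print(less, equal, greater, Fraction(less, 2 ** len(a)))
```

For this vector exactly 6 of the 16 sums are strictly below 1 in absolute value, a proportion of 3/8, while 14 of the 16 satisfy the non-strict inequality.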
Terry Raines (wrc) says that Erdős, and not your editor, was right: Pambuccian 
[1986, 627] asked for $a(n)$, the smallest integer $a$ for which there’s an integer $b$, 
$0 < b < a$, $a \perp b$ (that is, $a$ and $b$ coprime), such that $a + b, 2a + b, \ldots, na + b$ are all composite and 
asked if $a(n)$ was always prime. Raines notes that for $n = 135$, $a(n) = 8207 = 29 \cdot 283$, 
with $b = 3251$. He has carried his computations to $n = 180$, and each of 
150 < n < 173 provides further counterexamples. 

In [1988, 927] Tony Gardiner showed that the following four questions are 
equivalent: for which primes $p$, if any, (A) is $\binom{2p}{p} \equiv 2 \bmod p^4$? (B) is $\sum_{r=1}^{p-1} r^{-1} \equiv 
0 \bmod p^3$? (C) is $\sum_{r=1}^{p-1} r^{-2} \equiv 0 \bmod p^2$? (D) does $p$ divide the numerator of the 
Bernoulli number $B_{p-3}$? Scott Hochwald (tbp) notes that the questions are indeed 
equivalent, but that Gardiner’s final congruence is incorrect and should be 
replaced by the statement that 


Pp 
S,-3(P) + 


is divisible by p*. The only known prime was 16843, but on a recent visit to Calgary 
Richard Macintosh found a second example, 2124679. 
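
Conditions of this kind can be checked by machine for a given prime. The sketch below is a hedged illustration: it assumes the reading of (A), (B), (C) as $\binom{2p}{p} \equiv 2 \pmod{p^4}$, $\sum r^{-1} \equiv 0 \pmod{p^3}$ and $\sum r^{-2} \equiv 0 \pmod{p^2}$, with inverses taken modulo the prime power, and the function name is ours. It needs Python 3.8 or later for `math.comb` and for negative exponents in three-argument `pow`.

```python
from math import comb

def gardiner_conditions(p):
    """Evaluate conditions (A), (B), (C), as read above, for an odd prime p."""
    cond_a = comb(2 * p, p) % p**4 == 2
    cond_b = sum(pow(r, -1, p**3) for r in range(1, p)) % p**3 == 0
    cond_c = sum(pow(r, -2, p**2) for r in range(1, p)) % p**2 == 0
    return cond_a, cond_b, cond_c

print(gardiner_conditions(7))
print(gardiner_conditions(16843))
```

For $p = 7$ all three conditions fail together, and for $p = 16843$ all three hold together, as the equivalence predicts.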

In [1989, 31] R. J. McG. Dawson asked if there was a subset of a square that 
contains disjoint connected sets A and B each containing two opposite corners, 
but does not contain two disjoint connected sets each containing two adjacent 
corners. Keith Whittington (1991) provides the counterintuitive affirmative answer. 

In [1989, 129] Clark Carroll asked for polynomials with integer roots whose 
derivatives all have integer roots. For cubics the answer is known and can be 
found, for example, in Walter (1987) or in Buddenhagen, Ford and May (1992); see 
also MONTHLY problem E3221, solved in [1989, 841-842]. For quartics, there are 
unpublished papers of Zagier (wrc), Buddenhagen and Ford (wrc) and the present 
writer, who may have been misleading in [1989, 907-908]. The situation is that for 
quartics with a repeated root there is an infinity of solutions, given essentially by 
the rational points on the elliptic curve $y^2 = x^3 - 156x + 560$, 57612 in Cremona 
(1992). It seems unlikely that there are quartics with all roots distinct, nor higher 
degree polynomials unless they have sufficiently many repeated roots, but these 
may still be open questions. 

In connexion with Sands’s guessing game [1990, 314], see Joel Spencer’s (1992) 
paper. 

I apologize that what were offered as unsolved problems in [1992, 74] are in fact 
well known results. Many of the big names in combinatorial number theory are 
among those who have written to say that Matiyasevich’s generalized harmonic 
numbers are essentially Stirling numbers of the first kind, and that his conjectures 
follow fairly easily from known properties. See especially Glaisher (1900), but also 
Nielsen (1906), Carlitz (1953), Olsen (1966) and Comtet (1974). 

In [1992, 178] John Connett asked if a bottle with an inside perfectly reflecting 
surface could be designed so that a beam of light shone into it was permanently 
trapped. Robert Dawson, Jan Mycielski and Lior Pachter immediately and inde- 
pendently designed such bottles; their results have been combined (1993). Other 
solutions were received from M. E. Taylor (wrc), from Madhu Vairy Nayakkankup- 
pam (wrc) and from the PCC Rock Creek Math Club—see Bercowitz et al. (wrc). 

The page numbers for Connett’s second reference should be 1113-1122. 

In connexion with the Gordon game [1992, 567] Bob Kibler writes: For the 
fourteen groups of order 16, White wins only in the cyclic case (by playing to 8). In 
D, White wins by playing to an element of order 5. In D- by playing to an element 
of order 2. In Z,, and Z,, by playing to the element of order 2. Black wins in Z,., 
Z1,, D, and Z, X Z, X Z;. In Z,, White wins by playing to 6—does he also win 
by playing to 9? 

Fatin Sezgin (wrc) applied various tests to the Mycielski sequence [1992, 373] as 
a result of which he asserts that it cannot be considered as random. 

Neil Calkin notes the relevance of Peter Cameron’s survey article (1987) and his 
own thesis (1988) to Steven Finch’s 0-additive sequences problem [1992, 671]. 
Finch has calculated 15 million terms of the sequence $\{a_n\}$, where $\{a_1, \ldots, a_6\} = 
\{3, 4, 6, 9, 10, 17\}$ and for $n \ge 6$, $a_{n+1}$ is the least integer greater than $a_n$ which is 
not of the form $a_i + a_j$, $i < j$; without detecting any regularity (ultimate periodicity 
of the differences). Finch believes that this may be due to a massive initial segment 
of irregular values, while Calkin suspects that there may be counterexamples to 
Finch’s conjecture. They are preparing a joint paper. 
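
The rule generating Finch's sequence can be sketched in a few lines. This is an illustrative reading of the definition above, with a function name of our own choosing; it makes no claim about how Finch's 15-million-term computation was organised.

```python
def zero_additive(initial, count):
    """Extend `initial` (sorted, distinct) to `count` terms.

    Each new term is the least integer greater than the last term that is
    not a sum a_i + a_j of two earlier terms with i < j.
    """
    seq = list(initial)
    sums = {x + y for i, x in enumerate(seq) for y in seq[i + 1:]}
    while len(seq) < count:
        candidate = seq[-1] + 1
        while candidate in sums:
            candidate += 1
        sums.update(x + candidate for x in seq)
        seq.append(candidate)
    return seq

terms = zero_additive([3, 4, 6, 9, 10, 17], 20)
print(terms)
print([b - a for a, b in zip(terms, terms[1:])])  # successive differences
```

Printing the differences is the natural way to look for the ultimate periodicity discussed above; a serious search would need far more terms than this sketch produces.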

David Callan (wrc) solved Parker’s permutation problem [1993, 287] affirma- 
tively, and gave an alternative proof that it involves the Catalan numbers. Volker 
Strehl notes that the problem is not new, and has been solved both qualitatively 
and quantitatively. It appears, in the ‘Griggs’ version of the last three lines of 
[1993, 289], in various contexts: completion of latin squares, a bus scheduling 
problem, number of terms in the permanent of a circulant matrix. Marshall Hall 
(1952) attributes the problem to George Cramer, and generalizes and solves it for 
general finite abelian groups. Marica and Schönheim (1969) apply the result to 
latin square completion. Brualdi and Newman (1970) solve the enumeration 
problem by a method closely paralleling that of Gessel in the article under 
discussion. Chang (1979) uses Hall’s theorem and cites Marica and Schönheim. 
Salzborn and Szekeres (1979) prove Hall’s theorem but give no references to 
earlier work; their motivation was a bus scheduling problem. 
PROBLEMS AND SOLUTIONS 


Edited by: 
Richard T. Bumby, Fred Kochman and Douglas B. West 


Proposed problems should be sent to the MONTHLY PROBLEMS address given on 
the inside front cover. Please include solutions, relevant references, etc. Three copies 
are requested. 


Solutions of published problems should arrive before May 31, 1994 at the MONTHLY 
PROBLEMS address given on the inside front cover. Solutions should be typed with 
double spacing, including the problem number and the solver’s name and mailing 
address. Two copies suffice. A self-addressed postcard or label should be included if 
an acknowledgment is desired. 


An asterisk (* ) after the number of a problem, or part of a problem, indicates that 
no solution is currently available. Partial solutions will be useful in such cases. 
Otherwise, the published solution is likely to be based on a solution which is complete 
and correct. Of course, an elegant partial solution or a method leading to a more 
general result is always useful and welcome. In addition, references to other 
appearances of MONTHLY problems or to solutions of these problems in the 
literature are also solicited. 


PROBLEMS 


10346. Proposed by David Doster, Choate Rosemary Hall, Wallingford, CT. 
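Prove that, for all primes $p$, 

\[ \sum_{k=1}^{p-1} \left\lfloor \frac{k^3}{p} \right\rfloor = \frac{(p-2)(p-1)(p+1)}{4} \]

and 

\[ \sum_{k=1}^{M} \left\lfloor \sqrt[3]{kp} \right\rfloor = \frac{(3p-5)(p-2)(p-1)}{4}, \]

where $M = (p-1)(p-2)$. 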
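
Before attempting a proof it can help to confirm both identities numerically. The sketch below assumes the reading $\sum_{k=1}^{p-1} \lfloor k^3/p \rfloor = (p-2)(p-1)(p+1)/4$ and $\sum_{k=1}^{M} \lfloor \sqrt[3]{kp} \rfloor = (3p-5)(p-2)(p-1)/4$; the helper names are ours, and the integer cube root is corrected after a floating-point guess to avoid rounding errors.

```python
def integer_cube_root(n):
    """Largest integer r with r**3 <= n, for n >= 0."""
    r = round(n ** (1 / 3))
    while r ** 3 > n:
        r -= 1
    while (r + 1) ** 3 <= n:
        r += 1
    return r

def first_sum(p):
    return sum(k ** 3 // p for k in range(1, p))

def second_sum(p):
    M = (p - 1) * (p - 2)
    return sum(integer_cube_root(k * p) for k in range(1, M + 1))

for p in (2, 3, 5, 7, 11, 13, 17, 19, 23):
    ok1 = 4 * first_sum(p) == (p - 2) * (p - 1) * (p + 1)
    ok2 = 4 * second_sum(p) == (3 * p - 5) * (p - 2) * (p - 1)
    print(p, first_sum(p), second_sum(p), ok1, ok2)
```

A numerical check of this kind is evidence only; it says nothing about why primality of $p$ matters, which is the point of the problem.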


10347. Proposed by T. S. Nanjundiah, University of Mysore, Mysore, India. 


For integer n > 1, define real numbers R,, by 


k 
Ry 
Prove that 


3 1 
yn-3 +5<R,<yn 


+ 
Ale 
+ 
Nl|e 


for n > 1. 


10348. Proposed by Jiang Huanxin, student, FuDan University, ShangHai, China. 


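Let $D$, $E$, $F$ be distinct points on the sides $BC$, $CA$, and $AB$ respectively of 
$\triangle ABC$. Let $\alpha = \angle BDF$, $\beta = \angle FDA$, $\gamma = \angle ADE$, and $\delta = \angle EDC$. If $AD$, $BE$, 
and $CF$ are concurrent and $\alpha/\beta = \delta/\gamma = m$ ($m \ne 1$), prove that $\alpha = \delta$ and 
$\beta = \gamma$. 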


10349. Proposed by Raphael M. Robinson, University of California, Berkeley, CA. 


The hyperbolic plane is tiled with equilateral triangles meeting seven at each 
vertex. Can the tiles be colored with seven colors in such a way that no two tiles of 
the same color meet, even at a vertex? (This problem was suggested to the 
proposer by David Gale.) 


10350. Proposed by Borislav Lazarov, Sofia, Bulgaria. 


Let $M$ be a set of positive integers. Let $P_M$ be the set of all primes that divide 
elements of $M$, and let $L_M$ be the set of elements of $M$ having no proper divisor 
in $M$. Show that $P_M$ finite implies $L_M$ finite. 


10351. Proposed by Leopold Flatto and Jeffrey C. Lagarias, AT & T Bell Laborato- 
ries, Murray Hill, NJ. 


Consider the random power series 

\[ f(t) = \sum_{n=0}^{\infty} \varepsilon_n t^n, \]

where the $\varepsilon_n$ are drawn independently from $\{-1, 1\}$, with the probability of $\varepsilon_i = 1$ 
being $p$ for all $i$. 

(a) If p = 1/2, show that f(t) has infinitely many zeros in the interval (0, 1) 
with probability one. 

(b) What happens if p # 1/2? 


10352. Proposed by Yves Nievergelt, Eastern Washington University, Cheney, WA. 


Let $U$ be an open subset of $\mathbb{R}^n$ with smooth boundary $\partial U$ contained in a ball of 
radius $R$. 

(a) For $n = 3$, show that $\mathrm{Vol}(U) \le R \cdot \mathrm{Area}(\partial U)/3$. 

(b) Generalize to arbitrary dimensions $n$. 


10353. Proposed by Barry Powell, Kirkland, WA. 


Show that, for any odd prime p, there do not exist non-zero integers, x, y, z 
satisfying 


(x,y) =1 ptxy x®©+y=z?P, 
NOTES 


Notes: (10347) A weaker version of this appeared as Problem A2 on the 19th 
Annual William Lowell Putnam Mathematical Competition (November 1958). 
(10349) Since seven triangles meet at each vertex, the angles in the triangles are all 
$2\pi/7$. The sum of the angles in each triangle is less than $\pi$ as required in a 
hyperbolic plane. (10353) See P. Ribenboim, Thirteen lectures on Fermat’s Last 
Theorem, especially pp. 67-68 where the Jacobi symbols $\left(\frac{Q_i}{Q_j}\right)$ are evaluated, with 
$Q_i = Q_i(a, b) = (b^i - a^i)/(b - a)$ with $a$ and $b$ odd, relatively prime, and $a \equiv b$ 
(mod 4). Other MONTHLY problems dealing with variations of the Fermat equation 
are E2771 [1979, 308; 1980, 407] and 6558 [1987, 884; 1990, 434]. 


SOLUTIONS 


Periodicity in Multiplicative Groups 


6658 [1991, 445]. Proposed by L. Van Hamme, Free University of Brussels, Belgium. 


Define a sequence of integers by 

\[ a(0) = 1, \qquad a(n) = \sum_{r=0}^{n-1} \binom{n}{r} a(r) \quad \text{for } n \ge 1, \]

so that $\sum_{n=0}^{\infty} a(n) x^n / n! = (2 - e^x)^{-1}$ for $|x| < \log 2$. (This is sequence 1191 in 
N. J. Sloane’s Handbook of Integer Sequences, New York, Academic Press, 1973.) 
Prove that if $p$ is a prime number and $m$ is an integer not divisible by $p$, then 

\[ a(mp^k + s) \equiv a(mp^{k-1} + s) \pmod{p^k} \]

for $k$ a positive integer and $s$ a nonnegative integer. 
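
The congruence can be spot-checked from the recurrence alone. The following sketch is illustrative (the names `sequence_a` and `congruence_holds` are ours) and reads the statement as $a(mp^k + s) \equiv a(mp^{k-1} + s) \pmod{p^k}$ with $m \ge 1$.

```python
from math import comb

def sequence_a(limit):
    """a(0..limit) from a(0) = 1, a(n) = sum_{r<n} C(n, r) a(r)."""
    a = [1]
    for n in range(1, limit + 1):
        a.append(sum(comb(n, r) * a[r] for r in range(n)))
    return a

def congruence_holds(a, p, k, m, s):
    return (a[m * p**k + s] - a[m * p**(k - 1) + s]) % p**k == 0

a = sequence_a(80)
print(a[:7])
checks = [
    congruence_holds(a, p, k, m, s)
    for p in (2, 3, 5) for k in (1, 2) for m in (1, 2, 3) for s in range(4)
]
print(all(checks))
```

The loop includes $m = 2$ with $p = 2$ and $m = 3$ with $p = 3$, cases in which $p$ divides $m$.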


Solution I by the proposer. Let $\mathbb{R}[X]$ be the set of all real polynomials considered 
as an $\mathbb{R}$-vector space and define a linear map 

\[ \phi : \mathbb{R}[X] \to \mathbb{R} \quad \text{by} \quad \phi(X^n) = a(n) \text{ for } n = 0, 1, 2, \ldots \]

Apply $\phi$ to the identity $(X + 1)^n = \sum_r \binom{n}{r} X^r$. Then, for $n \ge 1$, 

\[ \phi((X+1)^n) = a(n) + \sum_{r=0}^{n-1} \binom{n}{r} a(r) = 2a(n) = 2\phi(X^n). \]

Hence, for any polynomial $p(X)$, 

\[ \phi(p(X + 1)) = 2\phi(p(X)) - p(0). \]

Taking for $p(X)$ the polynomial $\binom{X}{r}$, and using the relation $\binom{X+1}{r} = \binom{X}{r} + \binom{X}{r-1}$ 
for all $r \ge 1$, we get 

\[ \phi\left(\binom{X}{r}\right) = \phi\left(\binom{X}{r-1}\right), \quad r \ge 1. \]

Thus, $\phi$ sends $\binom{X}{r}$ to 1 for $r = 0, 1, 2, \ldots$. If a polynomial $p(X)$ takes only integer 
values for $X = 0, 1, 2, \ldots$, then $p(X)$ is of the form $p(X) = \sum_r c_r \binom{X}{r}$ with $c_r \in \mathbb{Z}$, 
and hence $\phi(p(X))$ is an integer. Now apply this observation to the polynomial 

\[ p(X) = \frac{X^{mp^k + s} - X^{mp^{k-1} + s}}{p^k}. \]

Since $a^{p^k} \equiv a^{p^{k-1}} \pmod{p^k}$ for all integers $a$, this polynomial is integer-valued; 
hence 

\[ \phi(p(X)) = \frac{a(mp^k + s) - a(mp^{k-1} + s)}{p^k} \]

is an integer, as required. 
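
The key step of Solution I, that $\phi$ sends every binomial polynomial $\binom{X}{r}$ to 1, can be confirmed for small $r$. The sketch below is an illustration with our own helper names; it expands $\binom{X}{r}$ into monomials with exact rational coefficients and applies $\phi(X^n) = a(n)$ term by term.

```python
from fractions import Fraction
from math import comb, factorial

def sequence_a(limit):
    """a(0..limit) from a(0) = 1, a(n) = sum_{r<n} C(n, r) a(r)."""
    a = [1]
    for n in range(1, limit + 1):
        a.append(sum(comb(n, r) * a[r] for r in range(n)))
    return a

def binomial_poly_coeffs(r):
    """Coefficients (constant term first) of X(X-1)...(X-r+1)/r!."""
    coeffs = [Fraction(1)]
    for j in range(r):
        shifted = [Fraction(0)] * (len(coeffs) + 1)
        for i, c in enumerate(coeffs):
            shifted[i + 1] += c      # contribution of c * X^i * X
            shifted[i] -= j * c      # contribution of c * X^i * (-j)
        coeffs = shifted
    return [c / factorial(r) for c in coeffs]

def phi(coeffs, a):
    """Linear map with phi(X^n) = a(n), applied to a coefficient list."""
    return sum(c * a[i] for i, c in enumerate(coeffs))

a = sequence_a(10)
values = [phi(binomial_poly_coeffs(r), a) for r in range(9)]
print(values)
```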


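Solution II by Jens Schwaiger, Universität Graz, Graz, Austria. Since 

\[ \sum_{n=0}^{\infty} a(n) \frac{x^n}{n!} = (2 - e^x)^{-1} = (1 - (e^x - 1))^{-1} = \sum_{j=0}^{\infty} (e^x - 1)^j \]

and since 

\[ \frac{(e^x - 1)^j}{j!} = \sum_{l=j}^{\infty} S(l, j) \frac{x^l}{l!}, \]

where $S(l, j)$ denotes the Stirling number of the second kind given by 

\[ S(l, j) = \frac{1}{j!} \sum_{i=0}^{j} (-1)^i \binom{j}{i} (j - i)^l \]

(cf. Louis Comtet, Advanced Combinatorics, D. Reidel, 1974, pp. 204-206) we get 

\[ a(n) = \sum_{j=0}^{n} j!\, S(n, j) = \sum_{j=0}^{N(n)} j!\, S(n, j) \]

where $N(n)$ is any integer greater than or equal to $n$. 
Putting $n_k = mp^k + s$ and choosing $N(n_k) = N(n_{k-1}) = n_k$, we thus get 

\[ a(n_k) - a(n_{k-1}) = \sum_{j=0}^{n_k} \sum_{i=0}^{j} (-1)^i \binom{j}{i} \left( (j - i)^{n_k} - (j - i)^{n_{k-1}} \right) \]

yielding the desired result as in Solution I. 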


Editorial comment. The solutions show that the condition that $p \nmid m$ in the 
statement is not required. 

A related use of the operator ¢@ of Solution I can be found in Gian-Carlo Rota, 
“The number of partitions of a set,’’ this MONTHLY, 71 (1964), 498-504. 


Solved also by D. Callan, E. Dobrowolski (Canada), O. P. Lossers (The Netherlands), R. Richberg 
(Germany), and C. Vanden Eynden. 
An Absorbing 4-Digit Number 


10194 [1992, 161]. Proposed by Jiro Fukuta, Gifu-ken, Japan. 


(a) For any four-digit number $x$ in base 12, excluding the eleven numbers with all 
digits equal, form the number $A = a_1a_2a_3a_4$ obtained by arranging the four digits 
in descending order of magnitude. Next form the number $B = a_3a_4a_1a_2$ obtained 
by exchanging the first two with the last two digits. Put $K(x) = A - B$ and 
$K^{i+1}(x) = K(K^i(x))$ for $i = 1, 2, \ldots$. Prove that $K^i(x) = 4378$ if $i \ge 5$. 

(b) Generalize to the base $3 \cdot 2^n$, $n = 0, 1, 2, \ldots$. 


Solution by Robin J. Chapman, University of Exeter, United Kingdom. When 
giving a number by digits, we surround it by parentheses, with commas between 
digits if needed for clarity. Replacing 12 by $b = 3 \cdot 2^n$, we prove that 
$K^i(x) = (2^n, 2^n - 1, 2^{n+1} - 1, 2^{n+1})$ if $i \ge 2n + 3$ and $x$ does not have equal digits. 

We first prove $K(x)$ has the form $(\alpha, \beta, b-1-\alpha, b-1-\beta)$, with 
$0 \le \alpha, \beta < b$. Since $(b-1-\alpha, b-1-\beta)_b = (b^2 - 1) - (\alpha\beta)_b$, the four-digit form 
specified equals $(b^2 - 1)[(\alpha\beta)_b + 1]$. By the definition, 

\[ K(x) = (b^2 - 1)\left[(a_1a_2)_b - (a_3a_4)_b\right]. \]

Hence we set $(\alpha\beta)_b = (a_1a_2)_b - (a_3a_4)_b - 1$ to complete the claim. This guaran- 
tees that $K(x)$ does not have all digits equal if $n > 0$, since that requires 
$\alpha = b - 1 - \alpha$, but $b$ is even. When $b = 3$, one can have $K(x) = (1111)$, but only 
when $A = 2210$. The 12 values of $x$ base 3 having this $A$ must also be excluded. 

We next prove $K^2(x)$ has the form $(\gamma, \gamma - 1, b - 1 - \gamma, b - \gamma)$, with $1 \le \gamma < b$, 
which we call $N(\gamma)$. By the first claim, the digits of $K(x)$ consist of two pairs 
summing to $b - 1$; let them be $(c_1c_2c_3c_4)$ when put in descending order. Then 
$K^2(x) = (\alpha', \beta', b-1-\alpha', b-1-\beta')$, where $(\alpha'\beta')_b = (c_1c_2)_b - (c_3c_4)_b - 1$. 
This sets $K^2(x) = N(\gamma)$ with $\gamma = c_1 - c_3$. 

Now we compute $K(N(\gamma))$ for $1 \le \gamma < b$. If $\gamma \ge (b+1)/2$, then in
descending order the digits of $N(\gamma)$ are $\gamma, \gamma-1, b-\gamma, b-\gamma-1$,
from which we compute $K(N(\gamma)) = N(2\gamma - b)$. If $\gamma \le (b-1)/2$, then in
descending order the digits of $N(\gamma)$ are $b-\gamma, b-\gamma-1, \gamma, \gamma-1$,
from which we compute $K(N(\gamma)) = N(b - 2\gamma)$. Finally, if b > 3 we may have
$\gamma = b/2$, in which case the digits of $N(\gamma)$ are b/2, b/2, b/2 - 1, b/2 - 1
and $K(N(\gamma)) = N(1)$. Summarizing, we have $K(N(\gamma)) = N(f(\gamma))$, where


$$f(\gamma) = \begin{cases} 1 & \text{if } \gamma = b/2, \\ |2\gamma - b| & \text{if } \gamma \ne b/2. \end{cases}$$


Note that $f(2^n) = 2^n$.

We now claim that $f^{2n+1}(\gamma) = 2^n$ for all $\gamma$ with $1 \le \gamma < b$, from
which the result follows immediately. This is trivial if b = 3, so we may assume
n > 0. If $\gamma \notin \{2^n, b/2, 2^{n+1}\}$, then $f(\gamma)$ is divisible by more
factors of 2 than $\gamma$ is. We reach $f^r(\gamma) = b/2$ for some $r \le n - 1$ or
$f^r(\gamma) \in \{2^n, 2^{n+1}\}$ for some $r \le n$. Since $f(2^{n+1}) = f(2^n) = 2^n$,
it suffices to show $f^{n+2}(b/2) = 2^n$. This follows by direct computation, since
$f(b/2) = 1$ and $f^t(1) = 3 \cdot 2^n - 2^t$ for $1 \le t \le n + 1$.
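
The claim about iterates of f can be checked mechanically for small n. The following is a minimal sketch, not part of the published solution; the helper names `f`, `iterate`, and `claim_holds` are chosen here for illustration only.

```python
def f(gamma, b):
    """The map f from the solution: 1 if gamma == b/2, else |2*gamma - b|."""
    if 2 * gamma == b:
        return 1
    return abs(2 * gamma - b)


def iterate(gamma, b, times):
    """Apply f to gamma the given number of times."""
    for _ in range(times):
        gamma = f(gamma, b)
    return gamma


def claim_holds(n):
    """Check f^(2n+1)(gamma) == 2^n for every gamma with 1 <= gamma < b = 3 * 2^n."""
    b = 3 * 2 ** n
    return all(iterate(gamma, b, 2 * n + 1) == 2 ** n for gamma in range(1, b))


if __name__ == "__main__":
    for n in range(6):
        print(n, claim_holds(n))
```

A finite check of this kind only confirms the claim for the values of n tried; the argument in the text is what covers all n.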

One can give examples where 2n + 3 iterations are needed. For n > 0,
let x = (b/2 + 2, b/2 + 1, 0, 0). Then $K^2(x) = N(3)$. Since $f^{2n}(3) = 2^{n+1}$,
$K^{2n+2}(x) \ne N(2^n)$. Hence the bound $i \ge 5$ in the statement of part (a) is
incorrect; $i \ge 7$ is needed.
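
For base 12 the whole statement is small enough to test by brute force. The sketch below is an illustration added here, not the solver's method: it applies K to every string of four base-12 digits (leading zeros allowed, which is an assumption of this sketch) whose digits are not all equal, and records how many iterations are needed to reach 4378. The function names are hypothetical.

```python
def to_digits(x, b, width=4):
    """Digits of x in base b, most significant first, padded to the given width."""
    digits = []
    for _ in range(width):
        digits.append(x % b)
        x //= b
    return digits[::-1]


def from_digits(digits, b):
    """Integer whose base-b digits, most significant first, are given."""
    value = 0
    for d in digits:
        value = value * b + d
    return value


def K(x, b):
    """A - B, where A has the digits of x in descending order and B swaps the two halves of A."""
    a = sorted(to_digits(x, b), reverse=True)
    return from_digits(a, b) - from_digits(a[2:] + a[:2], b)


def steps_to_absorb(x, b, target, limit=100):
    """Smallest i >= 0 with K^i(x) == target, or None if not reached within limit."""
    for i in range(limit + 1):
        if x == target:
            return i
        x = K(x, b)
    return None


if __name__ == "__main__":
    b = 12
    target = from_digits([4, 3, 7, 8], b)
    worst = max(
        steps_to_absorb(x, b, target)
        for x in range(b ** 4)
        if len(set(to_digits(x, b))) > 1
    )
    print("largest number of iterations needed:", worst)
```

Running the search over all admissible digit strings is a direct way to compare the bound stated in the problem with the bound 2n + 3 derived in the solution.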
Editorial comment. A. Tissier and S. Sagong noted that in bases other than 2 or
$3 \cdot 2^n$ this iteration has no fixed point.


Solved also by J. C. Binz (Switzerland), L. Coutry (Egypt), M. Dindos (Slovakia), F. H. Kierstead, 
Jr., S. Sagong, R. Stong, National Security Agency Problems Group, and the proposer. 


Cutting a Parameterized Circle in Half 


10198 [1992, 162]. Proposed by David M. Bloom, Brooklyn College of CUNY, 
Brooklyn, NY. 


Suppose f is a continuous map of [0,1] onto a circle. Prove that there exist two 
closed subintervals of [0, 1] intersecting in at most one point whose images under f 
are complementary semicircles (i.e., semicircles intersecting only at their end- 
points). 


Solution by Richard Stong, Rice University, Houston, TX. View the circle as 
R/Z. Since [0,1] is simply-connected, f lifts to a continuous map g: [0,1] → R.
Let a be the maximum value that g attains and let b be a point where g(b) = a. 
Since f is onto, g must attain values arbitrarily near a — 1. Therefore, since g is 
continuous there must be some point e with g(e) = a — 1. Assume for definite- 
ness that e > b. Let c and d be respectively the smallest and largest values in 
[b,e] for which g(x) = a — 1/2. Then [b,c] and [d, e] are the desired intervals. 
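
The reason these intervals work can be written out as a short worked step. This is a sketch under one extra assumption that is made here and not stated in the solution: e is taken to be the point with g(e) = a - 1 nearest to b, so that $g \ge a - 1$ on [b, e].

```latex
\begin{align*}
g([b,c]) &= \bigl[a - \tfrac12,\; a\bigr]
  && \text{since } g(b) = a,\ g(c) = a - \tfrac12,\ g \le a,
     \text{ and } g > a - \tfrac12 \text{ on } [b,c), \\
g([d,e]) &= \bigl[a - 1,\; a - \tfrac12\bigr]
  && \text{since } g(d) = a - \tfrac12,\ g(e) = a - 1,\
     g < a - \tfrac12 \text{ on } (d,e],
     \text{ and } g \ge a - 1 \text{ on } [b,e].
\end{align*}
```

Each closed interval of length 1/2 in R projects onto a closed semicircle of R/Z, and the two projections meet only in the images of a - 1/2 and of a (which equals the image of a - 1). Since $c \le d$, the intervals [b, c] and [d, e] intersect in at most one point.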


Solved also by K. F. Andersen (Canada), D. W. Bailey, W. H. Beckmann, F. Brulois, R. J. Chapman 
(U.K.), K. S. Kedlaya (student), Y.-H. Kiem (student, Korea), R. Martin (student), A. Miller (France), 
A. Nijenhuis, N. Passell, B. Richmond, A. Riese, S. T. Stefanov (Bulgaria), E. Suarez (Spain), J. Vogel, 
T. Zeanah & E. G. Katsoulis, Northern Kentucky University Problem Group, and the proposer. One 
incorrect solution was received. 


Just Below the Graph of 1/(1 - x)


10209 [1992, 266]. Proposed by Feng Hangiao, Shaanxi Normal University, Xian, 
China, and Siu-Ah Ng, University of Hull, Hull, England. 


For each non-negative integer k, define $a_k(n)$ for non-negative integers n by


$$a_k(0) = 1 \quad\text{and}\quad a_k(i+1) = a_k(i)\Bigl[1 + \frac{1}{k}\,a_k(i)\Bigr] \quad (i \ge 0).$$

Find $\sup_n a_{mn}(n)$ for m = 1, 2, ....


Solution by Reiner Martin (student), University of California, Los Angeles, CA. 
We will show that 


$$\sup_n a_{mn}(n) = \begin{cases} \infty & \text{for } m = 1, \\[4pt] \dfrac{m}{m-1} & \text{for } m > 1. \end{cases}$$


These expressions follow from the inequalities 


$$\frac{k}{k-n} - \frac{kn}{(k-n)^3} \;\le\; a_k(n) \;\le\; \frac{k}{k-n}. \tag{1}$$


The right inequality is valid when $k > n \ge 0$; the left inequality requires the
additional condition that $k \ge n + \sqrt{n}$. Given (1), set k = mn to obtain the result
for m > 1. Since $a_k(n)$ is clearly an increasing function of n for fixed k, the result
for m = 1 will follow from the fact that the left side is unbounded as a function of
n and k with k > n.

We prove (1) by induction on n, the case n = 0 being trivial. For the right side,
if k > n + 1, the inductive step is


$$a_k(n+1) = a_k(n)\Bigl(1 + \frac{1}{k}\,a_k(n)\Bigr) \le \frac{k}{k-n}\Bigl(1 + \frac{1}{k-n}\Bigr) \le \frac{k}{k-n-1},$$


where the rightmost inequality follows from $(k-n-1)(k-n+1) \le (k-n)^2$.
Denote the left side of (1) by f(k, n). In order to use


$$f(k,n)\Bigl(1 + \frac{f(k,n)}{k}\Bigr) \le a_k(n)\Bigl(1 + \frac{a_k(n)}{k}\Bigr) = a_k(n+1)$$


in the inductive step, we demand $f(k,n) \ge 0$, which is guaranteed by $k \ge n + \sqrt{n}$.
This being so, we then wish to show that


$$f(k,n)\Bigl(1 + \frac{f(k,n)}{k}\Bigr) - f(k, n+1)$$


is positive. This is easily done using a computer algebra package. Multiplying this
expression by $(k-n)^6 (k-n-1)^3$ yields k times an expression which becomes


$$n^2l^3 + nl^5 + 8nl^4 + 17nl^3 + 15nl^2 + 6nl + n + 2l^5 + 9l^4 + 16l^3 + 14l^2 + 6l + 1$$


on substituting k = n + 1 + l. Since this is a polynomial with positive coefficients,
the result follows.
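
The computer algebra step can be spot-checked with exact rational arithmetic. The sketch below is an illustration added here, not the solver's computation: it evaluates the scaled difference at integer points and compares it with the polynomial in n and l; the names `f`, `scaled_difference`, and `polynomial` are chosen for this example. Both sides are polynomials of degree below 10 in each variable, so agreement on an 11-by-11 grid of distinct values is enough to identify them.

```python
from fractions import Fraction


def f(k, n):
    """Left side of (1): k/(k-n) - k*n/(k-n)^3, as an exact fraction."""
    d = Fraction(k - n)
    return k / d - k * n / d ** 3


def scaled_difference(k, n):
    """[f(k,n)(1 + f(k,n)/k) - f(k,n+1)] * (k-n)^6 * (k-n-1)^3 / k."""
    diff = f(k, n) * (1 + f(k, n) / k) - f(k, n + 1)
    return diff * (k - n) ** 6 * (k - n - 1) ** 3 / k


def polynomial(n, ell):
    """The polynomial with positive coefficients quoted in the solution."""
    return (n ** 2 * ell ** 3 + n * ell ** 5 + 8 * n * ell ** 4
            + 17 * n * ell ** 3 + 15 * n * ell ** 2 + 6 * n * ell + n
            + 2 * ell ** 5 + 9 * ell ** 4 + 16 * ell ** 3
            + 14 * ell ** 2 + 6 * ell + 1)


if __name__ == "__main__":
    ok = all(
        scaled_difference(n + 1 + ell, n) == polynomial(n, ell)
        for n in range(11)
        for ell in range(1, 12)
    )
    print("identity holds on the grid:", ok)
```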


Editorial comment. By various means, most solvers related $\sup_n a_{mn}(n)$ to
$\sum_{n \ge 0} x^n = 1/(1 - x)$. Christopher P. Grant and Thomas Kunkle did so by noting
that the sequence $a_k(i)$, $0 \le i < k$, is the approximation to the solution of $y' = y^2$,
y(0) = 1 on [0, 1) generated by Euler's method with step size 1/k.
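
The Euler's method reading of the recursion is easy to try numerically. The following is a minimal sketch in floating point, added here for illustration; the function names `a` and `bounds` are hypothetical. It computes $a_{mn}(n)$ and compares it with the two sides of inequality (1) at k = mn, whose right side equals m/(m - 1).

```python
def a(k, n):
    """a_k(n): n Euler steps of size 1/k for y' = y^2 starting from y(0) = 1."""
    y = 1.0
    for _ in range(n):
        y = y * (1.0 + y / k)
    return y


def bounds(k, n):
    """The left and right sides of inequality (1) for k > n."""
    d = k - n
    return k / d - k * n / d ** 3, k / d


if __name__ == "__main__":
    for m in (2, 3, 5):
        for n in (10, 100, 1000):
            low, high = bounds(m * n, n)
            print(m, n, low, a(m * n, n), high)
```

As n grows with m fixed, the printed lower bound and the computed value both approach m/(m - 1), which is the value of 1/(1 - x) at x = 1/m.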


Solved also by R. J. Chapman (U.K.), C. P. Grant, T. Kunkle, O. P. Lossers (The Netherlands), 
R. Stong, and the proposers. 


REVIVALS 


Homeomorphisms of Compact Metric Spaces 


6612 [1989, 846; 1991, 663]. Proposed by Ebrahim Salehi, University of Nevada, Las 
Vegas, NV. 


Suppose X is a compact metric space with metric d, and suppose T: X → X is
continuous. If


$$\inf_{n \in \mathbb{N}} d(T^n x, T^n y) > 0$$


for each pair x, y of distinct elements of X, prove that T is onto.
Editorial comment. Shortly after the original publication of a solution, David B.
Ellis, Ebrahim Salehi, and John Henry Steelman provided counterexamples to the
claim, made in that solution, that


$$d'(x,y) = \inf_n d(T^n x, T^n y)$$


is a metric. For example, if X consists of the three real numbers 0, 1, x with
0 < x < 1/2, using the metric induced from R, and T interchanges 0 and 1 while
fixing x, then d'(0, 1) = 1 > 2x = d'(0, x) + d'(x, 1). Deeper constructions appear
to be needed to solve the problem. The following solution is based on the idea of
the enveloping semigroup. The enveloping semigroup and related notions have
proven to be extremely valuable in topological dynamics (see references). The
previous argument claimed to work even when T is not assumed continuous. It is
still open to decide if the assumption of continuity is required.
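
The three-point counterexample can be verified directly. The sketch below is added here for illustration, with the particular choice x = 1/4 and the helper name `d_prime` as assumptions of the example. Since T composed with itself is the identity on this space, the infimum over n is a minimum over two consecutive iterates.

```python
from fractions import Fraction

x = Fraction(1, 4)          # any value with 0 < x < 1/2 would do
T = {0: 1, 1: 0, x: x}      # T swaps 0 and 1 and fixes x


def d(p, q):
    """The metric induced from the real line."""
    return abs(p - q)


def d_prime(p, q, period=2):
    """inf over n of d(T^n p, T^n q); T has period 2 here, so a short loop suffices."""
    best = d(p, q)
    for _ in range(period):
        p, q = T[p], T[q]
        best = min(best, d(p, q))
    return best


if __name__ == "__main__":
    print(d_prime(0, 1), d_prime(0, x) + d_prime(x, 1))
```

The first printed value exceeds the second, so the triangle inequality fails for d'.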


Solution by David B. Ellis, Beloit College, Beloit, WI. In order to define our
semigroup of functions, we consider the set $X^X = \{f : X \to X\}$ of all self maps of
X. Note that $X^X$ is a semigroup under composition. We give $X^X$ the topology of
pointwise convergence, so that


$$f_\alpha \to f \iff f_\alpha(x) \to f(x) \quad \text{for every } x \in X.$$

This makes $X^X$ a compact Hausdorff space. By analogy to the enveloping semigroup
of (X, T), we form the closure in $X^X$ of the strictly positive iterates of T:

$$E(X,T) = \overline{\{T, T^2, \ldots, T^n, \ldots\}} \subset X^X.$$


Our solution requires two lemmas concerning E(X, T). The first lemma is an 
immediate consequence of the assumption that T is continuous and the fact that 
we have given $X^X$ the topology of pointwise convergence.


Lemma 1. Let X be a compact Hausdorff space and T: X → X be continuous. Then


(a) the function $L_T : X^X \to X^X$ defined by $L_T(p) = T \circ p$ is continuous,

(b) the function $R_p : X^X \to X^X$ defined by $R_p(q) = q \circ p$ is continuous for every
$p \in X^X$,

(c) E(X, T) is a subsemigroup of $X^X$.


Lemma 2. Let S be a compact Hausdorff space with a semigroup structure in which
$R_p$, defined as in Lemma 1(b), is continuous for every $p \in S$. Then S contains an
element u with $u^2 = u$.


Proof: We use a Zorn's lemma argument. Let

$$\mathcal{M} = \{M \subset S \mid \emptyset \ne M \text{ is closed and } M^2 \subset M\}.$$


Note that $S \in \mathcal{M}$, so $\mathcal{M}$ is nonempty. If $\{M_\alpha \mid \alpha \in A\}$ is a
descending chain of elements of $\mathcal{M}$, then


$$M = \bigcap_{\alpha \in A} M_\alpha \in \mathcal{M}$$


is an infimum. Applying Zorn's lemma we get a minimal nonempty element
$N \in \mathcal{M}$. Let $u \in N$. Then $R_u(N) = Nu$ is a compact, hence closed, subset of S.
Since $(Nu)(Nu) = (NuN)u \subset Nu \subset N$, it follows that Nu = N because N is
minimal. Now set

$$Q = \{v \in N \mid vu = u\} = R_u^{-1}(\{u\}) \cap N.$$


Q is nonempty because $u \in N = Nu$; Q is closed because $R_u$ is continuous.
Moreover $(v_1v_2)u = v_1(v_2u) = v_1u = u$ for any $v_1, v_2 \in Q$; thus $Q^2 \subset Q$. The
minimality of N implies that Q = N. In particular $u \in Q$ so that $u^2 = u$.
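
When X is a finite set with the discrete metric, E(X, T) is simply the finite set of positive powers of T, and Lemma 2 reduces to the familiar fact that some power of T is idempotent. The sketch below illustrates only this finite special case, which is an assumption of the example rather than the setting of the solution; self-maps of {0, ..., m-1} are stored as tuples, and the function names are hypothetical.

```python
def compose(p, q):
    """Return the map 'p after q' for self-maps of {0, ..., m-1} stored as tuples."""
    return tuple(p[q[i]] for i in range(len(q)))


def idempotent_power(T):
    """Return (n, T^n) for the first n >= 1 such that T^n composed with itself is T^n."""
    power, n = T, 1
    while compose(power, power) != power:
        power = compose(T, power)
        n += 1
    return n, power


if __name__ == "__main__":
    print(idempotent_power((1, 2, 0)))   # a 3-cycle, which is onto
    print(idempotent_power((1, 1, 2)))   # not injective, hence not onto
```

For a permutation the idempotent power is the identity, matching the step x = u(x) in the argument that follows; for a non-injective map the idempotent collapses two points, which is exactly what the hypothesis on T rules out.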

We now show how the desired result follows from these two lemmas. 

Since X is compact, the image of T is closed. Thus it suffices to show that the
image of T is dense. To this end, choose $x \in X$ and let U be any open
neighborhood of x. We will show that U intersects the image of T.

By the lemmas, we can find an idempotent $u \in E(X, T)$. In particular


$$u(u(x)) = u(x).$$

Now u is a limit point of the strictly positive iterates of T in the topology of
pointwise convergence. Thus for any neighborhood V of u(x) there exists n > 0
such that


$$T^n(u(x)),\ T^n(x) \in V.$$


The assumption that $\inf d(T^n x, T^n y) > 0$ when $x \ne y$ implies that x = u(x).
Taking V = U we have $T^n(x) \in U$, and hence U intersects the image of T.


REFERENCES 


1. J. Auslander, Minimal Flows and their Extensions, North Holland, Amsterdam, 1988 
2. D. Ellis, “What does Topological Dynamics have to do with Algebra?”, preprint 
3. R. Ellis, Lectures on Topological Dynamics, Benjamin, New York, 1969 


Extraneous Primes 


E 3452 [1991, 645]. Proposed by C. A. Nicol and J. L. Selfridge, University of South 
Carolina, Columbia, SC. 


If n is an odd integer greater than 3 and $\varphi$ is the Euler function, prove that
there exists a prime p such that $p \mid (2^{\varphi(n)} - 1)$ but $p \nmid n$.


Editorial comment. Gerry Myerson has pointed out that an extension of the
result was misstated, and two values were omitted from what was claimed to be a
"complete list". The items listed are the set of pairs a, n such that (a, n) = 1,
a > 1 and n > 2 for which there is no prime p such that $p \mid (a^{\varphi(n)} - 1)$ but $p \nmid n$.
The complete list of such (n, a) is {(3, 2), (4, 3), (6, 2), (6, 3), (6, 5), (6, 7), (6, 17),
(10, 3)}.
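
Whether a given pair (n, a) admits an "extraneous" prime is a finite computation. The following is a minimal sketch added here for illustration; the names `euler_phi` and `has_extraneous_prime` are hypothetical, and the helper deliberately does not enforce the condition (a, n) = 1, so it can be applied to every pair exactly as printed.

```python
from math import gcd


def euler_phi(n):
    """Euler's function, computed naively by counting residues coprime to n."""
    return sum(1 for j in range(1, n + 1) if gcd(j, n) == 1)


def has_extraneous_prime(n, a):
    """True if some prime divides a^phi(n) - 1 but does not divide n."""
    m = a ** euler_phi(n) - 1
    g = gcd(m, n)
    while g > 1:            # strip every prime factor that m shares with n
        m //= g
        g = gcd(m, n)
    return m > 1            # anything left is built from primes not dividing n


if __name__ == "__main__":
    listed = [(3, 2), (4, 3), (6, 2), (6, 3), (6, 5), (6, 7), (6, 17), (10, 3)]
    print([has_extraneous_prime(n, a) for n, a in listed])
    print(all(has_extraneous_prime(n, 2) for n in range(5, 200, 2)))
```

The second line of output tests the original problem (a = 2, odd n > 3) over a small range only; it is a sanity check, not a proof.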


Source-even Orientations of Graphs 


E 3462 [1991, 755; 1993, 594]. Proposed by J. J. Rotman, University of Illinois at 
Urbana-Champaign, IL. 


Prove that any connected simple graph with an even number of edges has an 
orientation (assignment of direction to each edge) such that the number of edges 
leaving each vertex is even. 
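
For a finite graph, one standard way to build such an orientation uses a spanning tree. The sketch below follows that construction; it is not taken from the published solution, and the function name and the convention that vertices are numbered 0, ..., n-1 are assumptions of the example. Non-tree edges are oriented arbitrarily, then tree edges are oriented from the leaves upward so that each non-root vertex ends with even out-degree; the root is then even as well because the out-degrees sum to the (even) number of edges.

```python
def even_out_orientation(num_vertices, edges):
    """Orient the edges of a connected graph so that every out-degree is even.

    Assumes the graph is connected, has an even number of edges, and has
    vertices 0, ..., num_vertices - 1. Returns a list of (tail, head) pairs.
    """
    adjacency = [[] for _ in range(num_vertices)]
    for idx, (u, v) in enumerate(edges):
        adjacency[u].append((v, idx))
        adjacency[v].append((u, idx))

    # Breadth-first spanning tree rooted at vertex 0 (the list grows as we scan it).
    parent, parent_edge, order = {0: None}, {}, [0]
    for u in order:
        for v, idx in adjacency[u]:
            if v not in parent:
                parent[v] = u
                parent_edge[v] = idx
                order.append(v)

    oriented = [None] * len(edges)
    out_degree = [0] * num_vertices
    tree = set(parent_edge.values())
    for idx, (u, v) in enumerate(edges):
        if idx not in tree:              # non-tree edges: any direction will do
            oriented[idx] = (u, v)
            out_degree[u] += 1

    for v in reversed(order[1:]):        # children are handled before their parents
        p, idx = parent[v], parent_edge[v]
        if out_degree[v] % 2 == 1:       # fix the parity of v with its tree edge
            oriented[idx] = (v, p)
            out_degree[v] += 1
        else:
            oriented[idx] = (p, v)
            out_degree[p] += 1
    return oriented


if __name__ == "__main__":
    k4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
    print(even_out_orientation(4, k4))
```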


Editorial comment. Fred Galvin has pointed out that the word “connected” was 
omitted from his result for infinite graphs. A correct statement is given below. 
Let G be a connected infinite graph and let $V_0$ be the set of vertices of finite
degree. Then, for any mapping $p : V_0 \to \{0, 1\}$, there is an orientation of G such
that, for every vertex $v \in V_0$, the number of edges leaving v has the same parity as
p(v).

If the graph G is allowed to have finite components, it is easy to construct
counterexamples. In particular, one can take an infinite number of disjoint copies
of the graph consisting of two vertices joined by a single edge, with p(v) = 0 for all
v.


Collaborating editors: David F. Appleyard, Paul T. Bateman, Bruce C. Berndt, 
Duane M. Broline, Barry W. Brunson, Frank S. Cater, Gulbank D. Chakerian, 
Underwood Dudley, Gerald A. Edgar, Michael A. Filaseta, Ira M. Gessel, Richard 
A. Gibbs, Jerrold R. Griggs, Douglas A. Hensley, John R. Isbell, Mourad E. H. 
Ismail, Murray Klamkin, Daniel J. Kleitman, Frederick W. Luttmann, Frank B. 
Miles, Richard Pfiefer, Stephen L. Portnoy, J. O. Shallit, John Henry Steelman, 
Kenneth B. Stolarsky, David E. Tepper, Douglas B. Tyler, Daniel Ullman, and 
William E. Watkins. 


List of referees and guest editors for 1993: Paul T. Bateman, Gilbert Baumslag, 
Jozsef Beck, John Brillhart, Ezra Brown, Barry W. Brunson, David Cantor, Bille C. 
Carlson, Frank S. Cater, Gulbank D. Chakerian, David A. Cox, Dennis DeTurck, 
John Duncan, Gerald A. Edgar, Noam Elkies, Michael A. Filaseta, Peter C. 
Fishburn, Dan Flath, Ira M. Gessel, Richard A. Gibbs, Bart Goddard, Sheldon 
Goldstein, Robert Louis Griess Jr., Branko Grünbaum, Leonid Gurvits, Richard
K. Guy, Douglas A. Hensley, John R. Isbell, Mourad E. H. Ismail, Paul Kainen, 
Geoffrey A. Kandall, Murray S. Klamkin, Janos Komlos, Martin Kruskal, Peter S. 
Landweber, Solomon Leader, Frederick W. Luttmann Jr., Richard N. Lyons, 
Frank B. Miles, Paul Monsky, Benjamin Muckenhoupt, Ram M. Murty, Roger 
Nussbaum, Beresford N. Parlett, M. J. Pelling, Richard E. Pfiefer, Robert W. 
Prielipp, Carl Pomerance, Stanley Rabinowitz, Mizanur Rahman, Doris 
Schattschneider, Peter Scott, Lawrence A. Shepp, Joseph Silverman, Kenneth B. 
Stolarsky, David E. Tepper, Jerrold Tunnell, Douglas B. Tyler, Daniel Ullman, 
William C. Waterhouse, William E. Watkins, Jeffrey R. Weeks, Gregory P. Wene, 
Douglas West, Cunhui Zhang. 


Institute For Advanced Study 


In describing the new Institute for Advanced Study at Princeton, Professor
Veblen said that a few years ago Mr. Bamberger decided to devote his wealth
to some useful purpose and through the influence of Mr. Abraham Flexner
decided to devote it to a project for the furtherance of pure scholarship. The
plan contemplates a small group of mathematicians who will be free to do
scientific work involving no bestowal of degrees, large liberty being allowed
the professors in conducting their activities in the form of seminars or
formal lectures or none, as they may wish. It is expected that the students will
be beyond the stage of the usual graduate student and that mathematicians
will come to the Institute for limited periods of time for the purpose of doing
some particular piece of work, for writing a book, etc.


—American Mathematical Monthly 
40, (1933) p. 128 